How does a Bayesian tell what time it is? She starts with an estimated time as her prior and then makes a video for TikTok. If you’ve ever made a joke like that and then realized your audience might need a little statistical education in order to appreciate how hilarious it is (or, perhaps, what the probability is that it’s hilarious), then this episode is for you. The Chatistician (and the creator of the #statstiktok hashtag), Chelsea Parlett-Pelleriti, joined the show to talk about tactics for making statistics accessible, both to ourselves and to others! Humor and thoughtfulness were both normally distributed throughout the discussion.
Articles, Resources, and Ideas Referenced in the Show
- Chelsea Parlett-Pelleriti
- The Chatistician
- Crash Course: Statistics on YouTube
- (Book) The Cartoon Guide to Statistics by Larry Gonick
- (Book) The Phantom Tollbooth by Norton Juster
- SQL Murder Mystery
- (Article) Thinking About Election Forecast Uncertainty by Andrew Gelman
- Towards a Principled Bayesian Workflow by Michael Betancourt
- The Measure Slack
00:04 Announcer: Welcome to the Digital Analytics Power Hour. Michael, Moe, Tim, and the occasional guest, discussing analytics issues of the day, and periodically using explicit language while doing so. Find them on the web at AnalyticsHour.io, and on Twitter at Analytics Hour. And now the Digital Analytics Power Hour.
00:27 Michael Helbling: Hi, everyone. Welcome to the Digital Analytics Power Hour. This is episode 149. I’m an analyst. Bayesian, multi-armed bandit. Algorithm, greedy, Bernoulli. Thompson sampling, what’s happening, what’s happening? Of course, Tim, you will immediately recognize my attempt at humor mixed with the classic hit by Megan Thee Stallion entitled Savage. But here’s the deal, we as analysts need to shoulder a pretty considerable load of being able to apply statistics in everyday situations, but even more than that, make those statistical concepts understood by end users of the data. At the best of times, it’s not an easy task, but have no fear. The fine folks here at the Digital Analytics Power Hour are here for you. Moe Kiss is the marketing analytics lead at Canva. There’s supposed to be cheering and applause here, Moe. It just doesn’t show up on the computer.
01:28 MK: Sure! Yeah, hi, how’s it going?
01:31 MH: And Tim Wilson is a senior director of analytics at Search Discovery, and also the… Ah yeah! Woo! The quintessential analyst. And Michael Helbling. I’m the owner and founder of AJL analytics, a strategic analytics consultancy. And then we’ll put in just three claps, like clap… Anyway, each of us grapples with this two-part challenge of statistics in our day-to-day work. Using it effectively and then explaining it to others, so we needed to add another voice. A guest who’s breaking down the walls in the field of statistics in innovative ways. Chelsea Parlett-Pelleriti is currently pursuing her doctorate in computational and data science. She has her Master’s in data science, as well, from Chapman University. She is a pioneer in the next generation of statistics accessibility and education. She runs a casual statistical consulting service providing statistical advice for your everyday needs, at the Chatistician. And today she is our guest. Welcome to the show, Chelsea.
02:34 Chelsea Parlett-Pelleriti: Thank you. Say statistician one more time. [laughter]
02:36 MH: Statistician. I’ve never… It’s… Yeah… It’s…
02:41 CP: Actually, I have a good story about that. I wrote the series on YouTube Crash Course Statistics, or I mean it’s Crash Course, the statistic series. And the host that they had, had a little bit of trouble saying the word statistics in one of the episodes, and everyone in the comments came for her, and I felt really bad because I mispronounce it all the time, and I live it. [chuckle]
03:06 MH: Well luckily, I’m a guy, so I get a lot more credit.
03:11 MH: It’s not gonna be a problem.
03:12 Tim Wilson: Now, I think… We may be on to something. This may be why so many people have started calling themselves data scientists, is that really it’s just a pronunciation issue.
03:22 MH: It’s so much easier to say.
03:24 MK: Yeah.
03:24 MH: Okay, so I think probably to get us started, Chelsea, talk to us a little bit about your background, and how you started intersecting with social media and statistics because you’re sort of… I don’t know if you’re the founder, but you’re certainly the popular riser of the concept of Stats TikTok and those kinds of things, which is kind of its own very unique space. Notwithstanding what’s happening with TikTok and all that, maybe we’ll talk about the politics a little, but maybe that’s a good place to start. Where did this find you and how did you become sucked into this vortex?
03:55 CP: Yeah, that’s a great question. I actually did pioneer Stats TikTok because I made up the hashtag. It was basically modeled after stats Twitter, that hashtag that’s really popular. So yes, that was me. So you’re welcome.
04:10 CP: I started TikTok because of lockdown, basically. I didn’t have anything else to do, and so I decided to download TikTok and everything in my life centers around statistics, whether I like it or not. I’m always thinking about things in terms of statistics. I decided why not TikTok? Because I love making video content, but it’s kind of intimidating to make long-form YouTube videos that aren’t pre-recorded lectures that I have to make, and TikTok seemed like the perfect way to capitalize on that, and capitalize that I had nothing to do in lockdown. I started making TikToks and the thing that immediately appealed to me was the fact that there’s a very… What would the correct word be? Mematic, like Meme-like, nature of TikToks, where it’s…
05:08 MH: Wait a minute. Hold on, I think you’ve just pioneered the use of the term mematic as well.
05:13 CP: So, that’s TikTok and mematic, and I’ll tell you what I mean by that. If you go on TikTok, which I don’t know if you have… It’s mostly teenagers.
05:23 MK: I’m terrified! I have no idea how to use it.
05:27 CP: Well, just look at Stats TikTok, that’ll be safe for you. The format that you tend to see is people using the same exact sounds, ’cause that’s what TikTok is, is you’re re-using sounds over and over in different contexts with different people, and I love that. And I’ve talked about this elsewhere, but I think that having this familiar structure of jokes is incredibly useful, both to comedians or whatever, but also to people like me, who are trying to be a little bit educational, or to do outreach, because it makes people feel comfortable and it gives people scaffolding, there we go, to understand topics that they may not otherwise want to engage with. So I really loved that, and that’s why I continued to make them. But to be honest, the reason I started was I was bored.
06:22 TW: So having gone through… or the TikTok, and I swore that I had not been on TikTok. And my 15-year-old daughter lit into me yesterday as we were talking about in… Maybe, or no, it was a couple of nights ago. Were you witnessing this, Michael? When she was telling me that I was… She absolutely remembered me sitting and going through TikTok, sitting on a chair and saying, “I don’t get it.”
06:47 CP: You blocked it out.
06:48 TW: And for the 800th time she was like, “You’re old and out of touch, dad,” and, “Move on.” But do you feel like in the… So I’ve kind of basically done the cheating and gone through the stats TikTok on Twitter hashtag and watched yours, which bubble up, and Dmitri and some others. Are you actually… Feel like that’s conveying the concepts, or are they really just kind of saying, “These are the concepts out there.” It does seem like statistics is hard and they’re entertaining and I’m like, “I recognize those words,” but there’s gonna need to be more digging to understand. I guess, where does it fit? That’s where I’m…
07:32 CP: That’s a great question. I think it fits everywhere because I think if you watch the whole range of stats TikToks that I’ve done, it ranges from, “This is just a joke, it’s not actually teaching you anything,” to, “I’m actually gonna teach you something in less than 60 seconds.” And I think that, on average, it’s falling in the category of, “I’m giving you some tiny bit of information. I am either teaching you something if you didn’t know it or reminding you of something if you did. But most of the learning’s gonna happen not on my TikTok.” And my favorite story of this is when I made my first stats TikTok. It was a joke about the assumptions of linear regression, and I named a couple of the assumptions in the TikTok. And one of my Twitter followers commented and said that his wife had seen the TikTok and had immediately looked up the assumptions of linear regression to kind of more deeply understand the joke I was making. And that was like, “Oh my God, amazing. My favorite thing in the world,” because yeah, I think it definitely provides a way to introduce ideas in that sense. But I will say I have made TikToks. I made one about R that was kind of instructional. I was saying, “Here’s something that I do, and here’s how you can do it and why I do it.”
08:57 MH: Wait, was that the… Was that putting the comments on libraries like the, “reminding yourself?” I’m like, “Oh I totally… ” Especially if it’s like, “Well, I have to do this for a freaking exploded pie chart ’cause that’s what the client wanted.” So I may even get a little catty with them. But I was like, “Yes. Oh wait, a professional says that that’s a good practice.”
09:17 CP: Yeah, I got some pushback for that one, but I… That’s what I do in my workflow, and I think it’s very useful so I put it out there.
09:26 MH: I felt validated, so…
09:27 CP: Good, good.
09:28 MK: Chelsea, I just wanted to ask. It sounds like you’re on this real quest to educate the world about stats which I completely love, because I do think lots of people that work in stats want it to almost stay unattainable or not understandable. I don’t know the correct wording, but they want to keep it separate so that they can feel smart and use these sophisticated methods and just be like, “You’re beneath it.” And it sounds like you have a very different approach. What got you so… You can tell by the way that you talk about it, you’re so passionate about this. Where does that come from? Yeah, what drives you?
10:10 TW: Yeah, what’s wrong with you? [laughter]
10:12 CP: Yeah. No, that’s a better question. That’s great. I think you’re asking two questions, so I’ll answer one of them first. The first one is, I agree. I think there can be an attitude of gatekeeping in statistics where we… “Only we understand this and you couldn’t possibly,” and I hate that. My saying is always that we accept everyone but we expect big things, and we expect rigor from them, because I’m not okay with people coming in, not putting in any effort, and doing that analysis. I think that gives us a bad name. It’s impactful in really negative ways and I don’t like that. But I really hate when people take that and they turn it into, “Well, you can’t do this,” or, “No one can do this,” or “Only some people can do this.” So I think that bleeds into the second question I think you’re asking is, why am I like this? [laughter] And I actually didn’t start in statistics. I have a Psychology degree in undergrad. I thought I was gonna be a clinical psychologist and to apply to clinical psychology school, you have to take advanced statistics, at least at my school. And I took it late. I had to beg the professor to let me into his class and fell absolutely in love with statistics. And this was the fourth time I was taking a statistics class, so it wasn’t love at first sight by any means. So I decided that I love statistics and that meshed really well with psychology so I decided to try research out.
11:51 CP: So I actually worked at a research lab between undergrad and starting graduate school where I got to capitalize on all my psychology knowledge but also develop and apply my statistics knowledge that I gained from the couple of classes I’d taken. And I just think it’s beautiful. I think the world is so, in terms of physics and math, chaotic. And statistics allows us to still make decisions to find order in all of that chaos, and so I just find that intrinsically beautiful. But I also saw how useful it was as a researcher and being able to be the person who helps other people find the order in the chaos of their data was so fun and I loved it. And I experienced the gatekeeping that you’re talking about because I didn’t come from inside the field. I didn’t have a math degree. I didn’t have a physics degree or whatever they expect of you. And I was really hurt, not emotionally, but I was impacted negatively by people who were gatekeeping. And I had someone say to me, “You can’t get a PhD in statistics or data science. Just go into quant psych, maybe they’ll take you.” And that was really not great for my [chuckle] career path at the time because I kind of said, “Oh, yeah. Maybe I can’t do it. Maybe it’s too hard for me.”
13:29 CP: And the fact that I have pushed past that somehow really inspires me to help other people push past that more quickly than I did, because I think there’s so many great people and perspectives that we need in the field and they’re not coming, not because they’re not capable, but because someone, for some godforsaken reason, told them they’re not capable even though that’s not true.
13:55 MK: And it sounds like you prefer the term statistician to data scientist.
14:00 CP: I’m an opportunistic data scientist. [chuckle] When the opportunity requires me to be a data scientist, then I’m a data scientist. I definitely have the computer science skills to back that title up. But the reason I call myself a statistician is more because I think it adds nuance to what my skill set actually is, because I see the term data science have very vague and wide-reaching [chuckle] definitions and I think telling people I’m a statistician really allows them to say, “Okay, you’re not the person that’s deploying the super fast model onto our Python server. You’re the person who is creating the model from theory, who is doing the inferential statistics instead of the predictive models.” So it’s not that I wouldn’t call myself a data scientist ever, obviously, my PhD is in that, but I think it better communicates to people what I can actually do.
15:04 TW: It just seems like that there’s the cachet or the perceived cachet behind data scientists and it took me a while to start to understand that if you take probably the most common Venn diagram that is like data scientist is statistics with computer science with some sort of domain expertise, that it did seem like the pipeline into that was driven… There were more people coming in from computer science and saying, “Oh, well now I learn a new language or two, I point data at it, I get some bare minimum ability to interpret the results of the model,” and to me that, the huge thing that’s missing is a truly fundamental grounding in statistics and an understanding of causal inference and uncertainty at a deeper level. I love that you sort of skewed towards the statistician label ’cause it’s like, “Yeah, that seems to be where there’s a gap,” when there are people with “Hey, no-code data science. Point this at our AutoML tool and it’ll just spit out the results.” And it’s like, “Yeah… ” That’s like saying, “Here are the keys to the Ferrari because you told me you’d seen a key once.” It’s gonna cause problems.
16:25 CP: Totally. I agree. I think it’s not that the other skills aren’t valuable, but I wanna repeat what you said, I think it’s that there is more of a gap for most people in data science in statistics, compared to the other expertise that you need. But yeah, I agree. I think that’s my fear as a statistician and as a stats educator, is that people are gonna keep doing things that are wrong, and not only because they’re wrong, but because they’re harmful. So yeah, I agree. I think people need more stats and that’s another reason why I do what I do, is not everyone is going to take the time to formally learn it in a program of some kind and I don’t think that’s necessary. But I do think that people need [chuckle] more theory in their tool belts when they’re approaching these really important and impactful problems, and I’m hopefully inspiring some of them to pick that up.
17:24 MK: Where do people start, though, and I know we’ve talked about it previously. Whenever I have that, “Ooh, I should spend some more time on stats.” I feel like there is a tsunami of resources, but I actually think it’s really hard to pinpoint, where do you start? And I find that tough.
17:47 CP: Well, what resources have you tried?
17:52 MK: Okay. So I’ve tried…
17:54 MH: Do I get to go through my list, too?
17:56 TW: Yeah. We’ll all share.
17:58 MK: I’ve tried some online courses. I tend to stick to stats with R type stuff because I feel like that’s at least in my wheelhouse. I actually emailed a bunch of universities because I was like, I wanted to do… I’m definitely like an in-classroom learner. I was like, “I’d love to do some intro classes at universities,” and they totally shut me down. They were just like, “Yeah, not happening.” And I found that really hard ’cause I’m not good at learning on my own, I like being around people. I actually love a good lecture and then you have time to digest, and go away and do something practical. I have read lots of books. I read that stupid one, The Stats… The Cartoon Guide to Statistics. It is not helpful.
18:36 TW: Wait a second. I’m not saying I’m recommending it, I just… At the time I read it, it was helpful to me.
18:44 MK: I found it very difficult to digest.
18:46 TW: To me, it was just… It was like the cutesy delivery format, but it was still just… It was like instead of actually reading it, you were having to read in bubbles and I didn’t feel like the drawing format in the cartoon and talking about poker… I’ve used that reference when I’m trying to explain to people where the modern theory of probability came from but as a pedagogical exploration, it was not useful.
19:09 MH: That’s why we have high hopes for the Mematic future of statistical pedagogy.
19:16 CP: They’re just a supplement. They’re just a supplement, not a replacement for teaching. [chuckle]
19:23 MH: Right, but that’s the future.
19:25 MK: But yeah, and then I’ve tried following people. I’ve tried doing Matt Gershoff’s homework. He’s a friend of Al’s, who’s a statistician that… Yeah. So I have tried lots of things. I just feel like I don’t… I don’t know if I need to drill in more…
19:40 MH: Well, let me say, I’ve got a question ’cause I had in 2001 took a class in statistics as part of business school, and I had a delightful professor, class wasn’t that hard, but it was a lot of building up the… These are the mechanics behind a standard error or a standard deviation and regression, and so I followed all of that along. I was working as it was, and I left that class and was like, “Well, there’s… I don’t know what to do with this in my day-to-day life.” And then I don’t know, eight or nine years later, I wound up taking another college level… I was working in a company where, with Ohio State, I could take their intro to statistics, and that turned out to be kind of just a design of experiments class. And the same thing happened. We’d run through… He loved the example of figuring out Neanderthal something, the guy who taught it. So it was like, “I follow all of this, but then what am I gonna do with it?” Then at my third attempt at it was getting… Talked to a local professor and I got Statistics in Plain English, and I read through that, same sort of thing, I followed all of it and said, “Now what do I do with it?” Same thing with The Cartoon Guide to Statistics.
20:53 MH: So to me, the part of it is like there’s the basic of starting with the, “If you flip a coin 10 times and it comes up heads every time, what’s the right… ” You start with those examples and there’s some base level intuition you can teach, but that’s not really applied. And then it feels like when you get to applied, you kinda have to have this body of knowledge and that… It just feels like a big chasm to get across, and if anything, I would say learning R, partly because I could run simulations to say, if you’re telling me this is gonna happen, well let me flip 10,000 coins. And that helped a bit, but now I feel like I’m at a point where I have just enough knowledge to be terrified, where I’m like, “I’m gonna misinterpret something here.” So where do you start, if you’re taking somebody who says, “I have the desire to learn and apply it and be responsible.” What’s the starting point in the progression?
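[Editor's sketch, not part of the recorded conversation: the "flip 10,000 coins" idea mentioned above can be tried in a few lines. The speaker describes doing this in R; the version below uses Python, and the function name, trial count, and seed are illustrative assumptions rather than anything from the show. It estimates how often ten fair flips all come up heads and compares that with the exact probability.]

```python
import random


def estimate_all_heads(n_flips=10, n_trials=10_000, seed=0):
    """Estimate P(all heads in n_flips fair coin flips) by simulation.

    All parameter names and defaults are hypothetical choices for this sketch.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = 0
    for _ in range(n_trials):
        # One trial: flip n_flips fair coins; count it only if every flip is heads.
        if all(rng.random() < 0.5 for _ in range(n_flips)):
            hits += 1
    return hits / n_trials


exact = 0.5 ** 10  # 1/1024, roughly 0.001
estimate = estimate_all_heads()
print(f"exact={exact:.5f}  simulated={estimate:.5f}")
```

[With only 10,000 trials the simulated value will wobble around the exact 1/1024 from run to run if the seed is changed, which is itself a small demonstration of sampling variability.]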
21:48 CP: Yeah, that’s a great question. And first of all, I’m still terrified of misinterpreting things, and I’ve spent now probably a little more than half a decade learning how to…
21:58 MH: That’s not what I wanted to hear. [laughter]
22:00 CP: So I don’t think that ever goes away, and actually, as a side note, I think that, that fear is what makes you a good statistician or a good analyst, because if you’re not constantly questioning the appropriateness of your assumptions, you’re gonna do a bad job all the time, and so while for your mental health, I don’t think you should feel that badly, having a little terror every now and then… Statisticians can have a little terror as a treat, I think that’s just ingrained in what we do, but to answer your question, I don’t know, I don’t think there is any one starting point because I think everything that you’re saying could be really well condensed into… “I don’t care about this right now, and it doesn’t mean anything to me right now, so I can follow along with the math, but why… Why am I doing this?” And so I always recommend to everyone, including my own students their, I don’t wanna call it a Capstone, ’cause it’s just a one semester course, but their final project is their own data that they’ve chosen, and it’s a more complicated project than that, but I really do that so that they’re using their skills on something that they care about. And that’s what I would suggest for everyone, is the best way to learn is to find a problem you care about and figure out what skills you need to address that problem and build up from the basics, wherever you are, up to those skills because that’s the only way it’s going to stick.
23:38 CP: I’ve taken so many statistics courses, and I only remember the ones that I used in my research or in my consulting, and for instance, I don’t wanna insult him, but I took a time series class that was beautifully taught, but I don’t use time series models, and so I can tell you a lot of basic facts about it, but I don’t have as deep of a knowledge about it as I do on other courses, like Bayesian statistics that I do use every single day, and so I think that that applies when you’re first learning statistics is if you don’t have a problem you care about, it’s going to be really difficult to figure out what you need to know and what you should learn next, because it’s such a vast field. Even if you all start at what’s a mean, what’s a median, what’s a standard deviation, there’s so many branches, you can go off of that, and I think you might be tempted to learn a little bit about all of them and then get overwhelmed. And so having a project that means something personally to you or that you’re getting paid to do ideally, is really helpful in learning, even just like theory of statistics.
24:52 MH: It’s funny you say the time because we’re… I do a lot with digital analytics data and a lot of our listeners as well, which is at its core, it is time series, there’s always the time component, and I actually, at one point, out of curiosity, started just going to a few different… Looking at the basic curriculum for some statistics programs and noticed that the time series, like there’s an entire semester on time series, and it is not the first semester, and it’s not the second semester, and I was like, “Well, son of a… I gotta get through all the basics before I can get to… ” But what you said totally makes sense, like I remember when I clicked on time series decomposition, I thought that, I couldn’t stop talking about that and how now, look, we have some things that now start to make a little more sense that you’re doing different types of things to it. So it’s funny, I would actually say, that’s one area where I was like, “Ah, this is data I know and I can do something with it,” So it’s like almost the exact opposite.
25:51 CP: Yeah, it’s so personal to you as a learner because I tend to work with, not time series data, but like repeated measures data. So I’m often looking at mixed effects models type things, and that’s what is sticking in my brain because that’s the stuff that I care about, but I found that as I have different projects, I’m gaining expertise in different areas. One recent area that I did not expect to care about was item response theory, because I had heard of it in psychology, but it’s not that sexy of a topic for statisticians or data scientists, and I was on a project that they said, “You need to do item response theory.” And I said, “Okay.” And I learned about it. And now it’s a huge part of my dissertation. So, I feel like as your interests and as the projects you’re involved in diversify so will your expertise because I think it just takes time to find something that’s interesting that will help that information stick. Not that you can’t learn it if you don’t have a project, but I think it really gives you a leg up in terms of deeply understanding whatever you’re trying to learn.
27:06 MK: I think that’s a really good suggestion. And yeah, I’ve gotta think about a good problem now.
27:11 CP: Yeah.
27:12 TW: Well, I don’t know Michael, ’cause there’s a guy, Ken Williams, that Michael worked with him when this happened. I work with him now, but my understanding is that he did not know Python, he did not know statistics. He had an idea for using, and he was using Google Sheets, right? And then, by the time he was done, he had learned Python, he had built out some kind of machine learning. So he had completely stumbled along. And then, at that point, he was begging. He was like, “Could somebody find me an expert?” He ran really far and fast trying to solve one problem, and that is kinda one of the nice things with machine learning. If the model works, yeah, it could probably be a lot better and it could be more efficient, but he got to the point of value and then was really equipped to have a discussion with actual data scientists, and then, he just kind of rocketed off from there. I don’t know, Michael, if I’m representing his journey accurately.
28:06 MH: Yeah. No, that’s pretty accurate. He just showed up one day. And he’s like, “Hey, I’ve been working on this in my spare time all summer.” And he was like, “I had built this thing in BigQuery and pulled in TensorFlow and made this whole machine learning thing.” And I was like, “You did what now?”
28:22 MH: I was like, “This has actually got pretty cool.” So, we tried to build some work around it and then eventually got some help and tweaked the model, but I do find it interesting that a lot of people who aren’t statisticians day-to-day… Now, every time I say that, I’m gonna screw it up. I’m gonna fuck it up every time I say statistics, statisticians. Okay.
28:46 TW: That’s okay. But now we know that our listeners are totally cool to just rip you and tease you mercilessly about that.
28:53 MH: It’s all good. I don’t care. We don’t do the show for you. I don’t know if you listeners know this. This show is actually more for us. We wanted to talk to Chelsea, so we’re talking to Chelsea. Okay. So, but I took two semesters of Stats for Business People. And mostly what I learned was, I remember learning, was the back of the book looking up Z-scores. I don’t even remember what Z-scores are for, but that’s what I remember. Obviously, learned other statistical concepts, but it’s always been at this point of interaction with, “I need to figure out how to get something done.” And so, I think it’s very heartening in a way to hear you describe it that way, which is pull statistics in at the point of attack, if you will, of where you’re trying to apply some better knowledge. ’Cause a lot of times in my own analytics work, it’s really sort of… I sort of get to the end of something and say, “It’s gotta be deeper than this. It’s gotta be better than this, or it’s gotta be better able to explain this, or there’s gotta be a better way of showing this.”
29:58 MH: And so that pushes me into like these other concepts. And then, I talk to people like Matt Gershoff who got mentioned before or other people who are really good at it, and I get my mind just is completely scrambled. And then, I get to unwind it. And eventually, I come out with usually some concept that I can kind of hold on to for a period of time after that, but it’s very nice to hear that I don’t have to, to be a professional, have some kind of big corpus of knowledge built up that I’m just ready to apply at any given time.
30:33 CP: Yeah, you will eventually. I think that’s…
30:36 MH: Yeah.
30:37 CP: But yeah, that’s actually interesting because I do think there is some sense in which you need some concepts that you understand in order to apply anything in statistics. Variance is a really good example of that, but I think that bar might be a lot lower than people think because if you’re being a careful analyst or a careful statistician, you’re gonna be asking yourself these questions that make you go back and reflect on those topics anyway. So, yeah, as long as you’re being careful, [laughter] practice careful statistics.
31:17 MH: Yeah. Well, and now what I’m gonna do is I’m gonna reach out to you, Chelsea, and hire you ’cause your rates are extremely reasonable. In fact, I would just go on record and say, “You need to raise your rates.”
31:29 CP: Yeah, I’m going to. I’m going to.
31:32 MH: As a consulting professional, I’m very unhappy with the rates that you’re at, and that’s somebody who’s probably gonna reach out to you after the show and try to hire you to help him with problems.
31:44 TW: Michael would actually like you to bill him for this so that he can then claim that he’s grandfathered in on the lower rate before you raised them.
31:50 MH: No. No. That’s not the case at all. I’m trying to help people out while also helping myself ’cause that’s the thing is I’ll often… I think a lot of people run into this where they’re like, “Yeah, I know there’s something more here, and I’m so sketchy on the knowledge. I just need somebody to tell me it’s gonna be okay.” And usually that’s… I go to Tim, and I get Tim’s help for free, but Tim also is really kind of an asshole.
32:13 MH: So…
32:14 CP: I’m very nice.
32:15 MH: Yeah, see, so that if you’re paying somebody for their time… Maybe I should pay you for your time, Tim, but your time is super expensive.
32:21 TW: You think? I think you’ve kind of tried that too, and that didn’t work either.
32:25 MH: Yeah, that’s right.
32:26 CP: Well, let me say something about that because I think I get that feedback a lot. And I am going to raise my rates eventually, but… Well, for two reasons, the admin reason that’s boring that I don’t really wanna talk about is just it’s not my full-time job. It’s just something I do for fun and experience. So, I wanted to make it accessible. But building off that, the main reason, and the one I wanna tell you about now, is that I built that consulting service as a reaction to what my friends were needing, my friends in psychology who aren’t statisticians. The other statisticians don’t need my help. [chuckle] But I would constantly get calls and texts from my friends who do their own analysis, but needed a little bit of sanity-checking or asking a question about what I thought was best, and I realized how valuable that was to early career researchers and graduate students. And the reason my rates are so low right now is because I wanted that to be very accessible to people who don’t have giant grants. So I actually do charge more if it’s someone with a grant, or it’s like a long-term contracting thing, that’s not my hourly rate, don’t get excited. But for people who really just need that little push, I want that to be accessible because graduate students don’t make very much. I know, I am one. [chuckle] So I just wanted them to be able to experience that.
33:57 MH: I’m gonna mention Matt Gershoff again ’cause he’s the one who we kind of put him on a pedestal, that makes him very uncomfortable and he loses us very quickly, but I know he also… He has a guy, a professor somewhere, that he will reach out to ’cause he runs a product company. And from your earlier statement about, yeah, even you are sometimes terrified. Am I interpreting this wrong? It seems like there’s even that piece to getting a second set of eyes like, “I know my stuff, but if I headed down this path,” even the, I pursued this problem, I found a path that was taking me forwards on it, but I may be completely missing the forest for the trees, and needing someone else who has enough knowledge and experience to say like, “Oh, yeah, I can see how once you took one misstep, you just followed the primrose path all the way into something that all logically made sense, but you kind of missed at the very beginning.”
34:54 MH: Or at the other extreme, “Hey, you’re heading completely down the right path, but did you know about this other little… Because you pursued that so directly, go check out this other thing.” Like, our Head of Data Science, I was talking about some Twitter curiosity stuff I had, and he was like, “Well, that sounds like TTR.” And I’m like, I’d never heard of text token ratio, like a text analytics thing, and that’s exactly what I want, but I’d never really done anything. Googled it, figured it out, and was able to play around with it, so it feels like there’s that piece too that they’re getting… Everybody’s gonna pursue their different directed things, so tapping into others. You’re gonna find people who know stuff, who have pursued things that weren’t of interest to you when they were covered in the class, I guess.
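For readers who want to try the TTR idea mentioned here: in text analytics the abbreviation is usually expanded as type-token ratio, the number of distinct words (types) divided by the total number of words (tokens). The sketch below is a minimal illustration, not what was actually run; the tokenizer is a deliberately naive assumption and the example tweets are made up.

```python
import re


def type_token_ratio(text: str) -> float:
    """Distinct words (types) divided by total words (tokens)."""
    # Naive tokenizer (an assumption): lowercase runs of letters, digits, apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Made-up example tweets, purely for illustration.
tweets = [
    "Stats are fun and stats are useful",
    "Every analyst needs a statistician friend",
]
for tweet in tweets:
    print(f"{type_token_ratio(tweet):.2f}  {tweet}")
```

A ratio near 1 means almost every word is new; lower values mean more repetition. The ratio tends to fall as texts get longer, so it is most meaningful when comparing texts of similar length.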
35:42 CP: Yeah, I mean, you have no idea how much I’m reaching out to my statistician friends. They don’t even need more knowledge than you, but like you said, a second pair of eyes is really useful to quell that fear that I feel all the time constantly. And yeah, I mean, to mention too that on Twitter, you mentioned Dmitri earlier, he and I are constantly messaging back and forth, quick little like, “Can you read this sentence? Does that make sense? I saw this in a book, I don’t think it’s correct. Can you double-check my intuition here?” And it’s so incredibly valuable and yeah, I think everyone needs a statistician friend, and if you don’t have one, you can hire me. But I’m happy, like on Twitter people ask me questions all the time, and I’m happy to answer those for free, because I think it is hard to find a statistician who’s willing to do that kind of small scale stuff, because most statisticians are saying, “Well, pay me a couple of thousand dollars and I’ll do your whole analysis for you.” And while that’s useful, and I’ve done that before, I think there’s something special about empowering other people to do their own analyses because they’re usually the experts, the domain experts, and you can’t do good statistics without domain expertise. So to empower them to do their own analyses is just so much richer than trying to do it yourself without any domain expertise whatsoever.
37:13 MK: So Chelsea, I just wanted to turn a corner a little bit, something that’s been on my mind. Tim has a comment here about how the language of statistics often feels foreign and then at some point it clicks, and why are we rejecting the null hypothesis rather than proving a hypothesis? But I’ve actually noticed, and I was chatting with some work colleagues about this the other day, sometimes I feel like the opposite is happening where people that have no foundation in stats are starting, and I probably shouldn’t use the word hijack ’cause it’s very loaded, they’re starting to hijack that language, and potentially really misusing it. One of my friends who’s a consultant said that they’ve just stopped using the word hypothesis ’cause it’s like, it’s become completely devoid of any meaning anymore because it’s thrown around so much. Is that something that you’re seeing in the world of stats? And how do you claim those terms back that actually really do mean something like a confidence interval?
38:12 CP: Yeah, that’s a great question. First of all, I’m predominantly Bayesian so your null hypothesis significance testing confusion is perfectly valid in my opinion. But yeah, I’m not sure if I notice people co-opting terms for themselves, but what I do notice, and maybe what you’re getting at is that people inappropriately use them all the time and confidence interval was an excellent example of that, because I don’t even think statisticians sometimes know what a confidence interval is, which is, just as a Bayesian plug why I think you should use intervals from your posterior, like credible intervals instead, because those do mean what you think they mean. But…
39:00 MH: ‘Cause priors and posteriors are so… Just so innately logical for marketers.
39:08 CP: Yeah. I mean, 100% agree. I think the fact that the math is hard in Bayesian stats is what has really prevented people from engaging with it as deeply as they have null hypothesis significance testing, but yeah, I agree, I think it’s wonderful in so many cases. But to get back to your question, I think that it’s really important to be precise about definitions and I don’t really have a solution except for to be precise when you use things and to define them well, because even though like you said, your friend says, “Oh, I’m not gonna use hypothesis anymore because people are misusing that so much,” I think that’s so valuable but I also think people are gonna run across that term all the time and I don’t know what to do except for to tell them what it really means in the context that I’m using it and just hope that they don’t ignore me. [chuckle]
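To make the credible interval idea concrete, here is a minimal sketch under stated assumptions: a conversion rate with a flat Beta(1, 1) prior, and made-up data of 48 conversions in 400 visits. The function name and its parameters are hypothetical. With a Beta prior and binomial data the posterior is again a Beta distribution, so the interval can be read straight off posterior draws.

```python
import random


def credible_interval(successes, trials, prior_a=1.0, prior_b=1.0,
                      level=0.95, draws=20000, seed=0):
    """Equal-tailed credible interval for a rate, from Beta posterior draws."""
    rng = random.Random(seed)
    # Beta prior + binomial data gives a Beta posterior (conjugacy).
    post_a = prior_a + successes
    post_b = prior_b + (trials - successes)
    samples = sorted(rng.betavariate(post_a, post_b) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


# Hypothetical data: 48 conversions out of 400 visits.
low, high = credible_interval(48, 400)
print(f"95% credible interval for the rate: {low:.3f} to {high:.3f}")
```

Given the model and the prior, the statement "there is a 95% probability the rate lies in this interval" is the correct reading, which is the interpretation people often wrongly attach to a confidence interval.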
40:05 TW: Yeah, I think the tricky thing for analysts is that they have that, as you said, that terror, which means then to go in and correct the situation is like a double whammy in a business context a lot of times or can feel like it. And I wonder just maybe just the answer to the question is just well you just gotta put yourself out there and do your best and try to right the wrongs and maybe do a TikTok about it, and maybe that can help, I don’t know. But it’s interesting ’cause the lack of precision in language is pretty robust across all of analytics, and so even in statistics as a core piece, it’s pretty rampant, but I think that dovetails into then how do we as analysts not just use statistics for ourselves, but promote a better understanding of statistics for people who just really don’t care. They just want an answer. A business user is not there to embrace the uncertainty. It’s sort of like, “Do I stop this marketing campaign, do I invest more dollars here, what should I do?”
41:17 MH: But isn’t that the core… To me that’s the big helping understanding of statistics is that… And when the light bulb went on for me a couple of years ago that yeah, “Just tell me yes or no, is this right or wrong,” and a statistician is like, “Well, there is uncertainty inherent there.” And I’m very… Relatively late in life, it was my oldest was applying to college when I found out that he was kind of smacked for having taken a statistics class when he was trying to go to an engineering school and they were like, “You didn’t even take a math, you didn’t take… How much are you really passionate about this if you didn’t take math?” I’m like, “What are you talking about? There are all sorts of formulas,” but that was… At the same time I was kind of back to exploring statistics and data science and R, and I do think there’s this huge challenge that we are so used to marketers and business people, they can get data and it is historical data and it is deterministic and they see it as it’s hard data, and statistics is fundamentally about using a sample to predict the population and there is so much uncertainty in there and understanding what level of uncertainty and what type of uncertainty matters.
42:33 MH: It’s a critical, not super easy… On one hand, very simple concept, on another, very nuanced concept, and that’s the domain of statistics. And when you’re running into somebody who says, “I don’t care about that just tell me what I should do,” they’re inherently trying to say, make it black and white when it is completely gray. So I guess maybe, outside of people who are trying to learn statistics, if you have somebody call up who’s a marketer who’s run an A/B split test and says, “Hey, you’re the… Tell me what’s right,” and you look at it and say “Well, it’s a little squishy, but it’s kind of pointing in this direction,” how much do you invest in trying to educate them versus… I don’t know, giving them a…
43:28 CP: That’s a great question and I face that a lot when I am trying to convince people to use Bayesian stats as opposed to the typical null hypothesis significance testing, because in a way that’s a microcosm of what you’re talking about: they want the black and white decision-making process, even though it may not be the appropriate one to answer the questions they have, and Bayesian statistics really allows you to quantitatively measure and communicate the uncertainty you have. And so I’ve had this conversation with a lot of people and I think there is some sense in which you can’t make them care about it, but I think there’s a lot to be said for giving contextual examples of why they should care about it. That is probably gonna be really helpful, because saying, “Okay, my maximum likelihood estimate is blah, blah, blah, but there’s a lot of uncertainty, and here’s what you could stand to lose if we’re wrong,” is really important and, I think, hopefully, if they care at least about money, will appeal to them in some way. But yeah, you can lead a horse to water but you can’t make them learn about statistics.
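One way to put numbers on "here’s what you could stand to lose if we’re wrong" for an A/B split test is sketched below. It is only an illustration under stated assumptions: flat Beta(1, 1) priors on each conversion rate, made-up counts, and a hypothetical function name. It estimates the probability that variant B really beats A, plus the conversion rate you would expect to give up by picking each variant when the other is actually better.

```python
import random


def compare_variants(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Monte Carlo comparison of two conversion rates with Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins_b = 0
    loss_if_b = 0.0  # rate given up if we pick B but A is truly better
    loss_if_a = 0.0  # rate given up if we pick A but B is truly better
    for _ in range(draws):
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins_b += rate_b > rate_a
        loss_if_b += max(rate_a - rate_b, 0.0)
        loss_if_a += max(rate_b - rate_a, 0.0)
    return {
        "p_b_beats_a": wins_b / draws,
        "expected_loss_choose_b": loss_if_b / draws,
        "expected_loss_choose_a": loss_if_a / draws,
    }


# Hypothetical test: A converts 120 of 1000 visitors, B converts 135 of 1000.
for name, value in compare_variants(120, 1000, 135, 1000).items():
    print(f"{name}: {value:.4f}")
```

Multiplying an expected loss by traffic and value per conversion turns it into money, which is often the framing a business user responds to: the result is squishy, but the downside of acting on it is small (or is not).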
44:47 TW: I just envisioned Mr. Ed wearing some kind of statistical thing.
44:52 CP: I hope he does.
44:54 TW: Okay, alright we have to start to wrap up unfortunately but this has been an excellent conversation, thank you so much Chelsea and one thing we love to do on the show is go around the horn and just do a last call. Something we’ve found that’s been interesting that we think might be of interest to our listeners. Chelsea, you’re our guest, would you like to share your last call?
45:16 CP: Yeah, absolutely. So you did say that this doesn’t have to be related to what we’re talking about so one thing I recommend all the time for people who are in any way interested in what I do is the book “The Phantom Tollbooth” because that was a huge inspiration to the way that I communicate about things, especially statistics. And it seems really weird because it’s a children’s book essentially, but there is a little bit of math in it, so you’ll get a little bit of math there, but I think it’s really indicative of the type of communication I wanna do that is approachable, is engaging, and is really informative in a way that sticks with people beyond the 10 minutes after they read it.
46:02 TW: Nice. That’s awesome. Thank you I like it. Yeah. Alright. Moe, do you have a last call you’d like to share?
46:08 MK: I do. Again, it’s a weird one. So, many of our listeners might know that I do struggle to find time for technical work, and trying to balance learning technical stuff, I sometimes find a challenge. So, I have been going through a really amazing journey with our internal coach, which I think I’ve mentioned before, and I wanted to share this takeaway, ’cause it’s actually really changed my life. So, previously, I used to block out three hours on a Friday afternoon to do some technical stuff. I never got there. It’s Friday afternoon. I’m always trying to finish stuff. So then, I moved it to Wednesday morning, ’cause Wednesday is my no meeting day. That also didn’t work great. And so now, I’ve tried a new technique, which my amazing internal coach has helped me try, which is I do 15 minutes every morning. And it sounds insane because I kinda was like, “Well, this is never gonna work because you can’t get into deep technical work in 15 minutes.” But the frequency, I’m actually finding really helpful. And some days I end up doing an hour or two or three, or I go back after work and do more. But I’m finding doing 15 minutes every day, Monday to Friday, is actually really helping me.
47:19 MK: And so, I’ve done two weeks straight of my goal and just wanted to share it, ’cause I know we’ve talked lots of times on the show about how do you find time to keep up with the technical side of your role if you’re in a people-lead position? So, I’m just finding that little tip is working for me right now. And he’s recommended a really good book, but I’m gonna wait until I read it. And then, I’ll let you know if I think it’s worth reading.
47:39 TW: I’ve never let that stop me before Moe, but I appreciate that.
47:45 TW: Why don’t you go next Michael?
47:46 MH: I will be happy to, Tim. Alright. So, I recently got… Somebody showed this to me and I just liked it a lot. So, apparently at Northwestern University, there is this lab called the Knight Lab. I don’t really know what their whole deal is, but they do a lot of stuff online. But one thing analysts really need is a skill set in SQL nowadays, and this is a SQL murder mystery. And so basically, you go through and you get access to a little database and you can write queries, and use the queries and the schema to basically come up with your murder mystery, who you think did it. And then, you get credit basically by checking your solution at the end. And it also has a walk-through. So, if you’re not very good at SQL, you can actually go and get sort of a step-by-step tutorial through this as well. So, it’s just a pretty neat little exercise, a way to hone your SQL skills in what I thought was a really practical and cool way, which I think is sometimes such a huge barrier to entry. It’s sort of like, “Okay, I wanna learn SQL.” I sort of like statistics in a way. So like, “Okay, now, SQL! SQL!” You just don’t know where to start. So, it’s sort of like, well, if I’m trying to find out whodunit, maybe that’s a good way for me to write some queries, explore a database, and joins, and things like that. So…
49:12 MK: Oh my God! Helbs, this is my next team event. You’ve saved the day. That sounds great.
49:16 TW: Nice!
49:18 CP: I love… I’m gonna check that out. That’s very cool.
49:21 MH: Yeah, we’ll keep it in the show notes. Alright Tim. What about your last call?
49:25 TW: So, often during shows, I wind up wanting to call an audible and then do a twofer… I’m just gonna call an audible and just switch what was gonna be my last call. And it’s gonna be a little bit of a throwback to our last episode with Elliott Morris, ’cause we kind of… We talked about Nate Silver and election forecasting a little bit. And Elliott and Nate have kind of gone back and forth. And since then, there’s been a little bit more back and forth. I think the one I was aware of at the time, when we were recording, was a discussion on multilevel regression and post-stratification, or MRP, which of course, all of us know. But Joe Sutherland, past guest, had forwarded me a thread ’cause he knew we’d talked to Elliott. And it was written by Andrew Gelman, who’s the Director of the Applied Statistics Center at Columbia, but he’s one of the contributors to, as I understand it, one of the contributors to The Economist’s political forecasting model. And he basically kind of deconstructs a back and forth on Twitter that Nate Silver and Elliott Morris had. And he wrote this post called “Thinking About Election Forecast Uncertainty”. And he basically deconstructs the tweets and sort of kind of calls out what points he thinks Nate Silver made that were fair, which ones were not.
50:43 TW: What kind of struck me as we were talking, he does get to a little bit of the Bayesian world of priors, and from an election forecast perspective, he’s like, “Well, okay, but if you just picture priors being a 50% coin flip when you know that’s not the case, wouldn’t that be a little bit ridiculous?” But it was interesting because it reminded me with Elliott, there was some discussion about… He was saying, “We feel like our model is kind of tilting a little too far in this direction and we’re trying to figure out how to tweak it.” And that apparently has continued into the public sphere. And this blog post is just very non-combative. Very, very clearly written. I think the whole Statistical Modeling, Causal Inference, and Social Science site, which I think is Andrew Gelman, or maybe he’s one of the contributors, may become a regular part of my reading, but it was a mid-length and delightful read deconstructing a Twitter back and forth between a couple of very sharp people.
51:47 CP: Can I just… This is cheating, ’cause I already gave my recommendation.
51:51 TW: That’s okay. Twofers are allowed.
51:53 CP: If you want something that is not short, but is still delightful, Michael Betancourt on Twitter, I believe, his handle is @betanalpha, which is probably the best way to find him. He wrote a pre-print/blog post called “Towards a Principled Bayesian Workflow.” I believe that’s what it’s called. I might be switching a word out there. It talks exactly about what you’re addressing, and I think is so important. And I would be remiss if I did not encourage the little Bayesians out there. Very great read because he really argues that you need to be thoughtful. And if you’re putting prior information into your model that doesn’t make sense, you’re gonna have a shitty analysis. And I think it ties really well into what you’re talking about here.
52:39 MH: Awesome. Very nice. I like it. I like it a lot. Actually, I like this whole show a lot. So our show is one where we encourage feedback and we would love to hear from you, the listener. And you can easily reach us. I think the Measure Slack is probably the easiest and fastest way. Obviously on Twitter. Chelsea, you’re also very active on Twitter, and you can find her @ChelseaParlett on Twitter. That’s with two Ts. And we’ll have that in the show notes as well so you can easily find and follow her on Twitter. I’m definitely guilty of sending your Twitter feed to other people and saying, “If you’re not following Chelsea, you need to because she’s the future.”
53:18 CP: Thank you.
53:19 MH: And on a personal note, I wanna just say thank you for the persistence and the pursuit of this and the passion that you have for this. I think seeing people represented in this space in this way is something I’m extremely passionate about so I just wanna say thank you for that.
53:36 CP: Yeah, thank you. That’s very kind of you.
53:38 MH: Anyway, I also wanna say thank you to our producer, Josh, because he does such a great job helping put the show together, and we wouldn’t be able to do it without him. And so I think that no matter if you’re a Bayesian or a frequentist, if we can even go that far, I think my two co-hosts would still agree with me, Moe and Tim, whatever you do, keep analyzing.
54:05 MK: Thanks for listening and don’t forget to join the conversation on Twitter or in the Measure Slack. We welcome your comments and questions. Visit us on the web at analyticshour.io or on Twitter at Analytics Hour.
54:19 Charles Barkley: So smart guys want to fit in so they made up a term called analytics. Analytics don’t work.
54:26 Thom Hammerschmidt: Analytics. Oh, my god. What the fuck does that even mean?
54:35 MH: Sorry Moe, I cut you off.
54:36 MK: No, it’s all good. But yeah, and then I’m just…
54:38 TW: We had high hopes for the memetic future of statistical pedagogy.
54:46 MK: Oh.
54:46 MH: Okay, start again Moe and then I’ll cut you off.
54:49 MK: Sorry, yeah.
54:50 CP: And so I made a website and I was like, “Yeah, this is what I do now.”
54:54 TW: But did you make it in R though because that’s what Tim would like to know?
54:58 CP: I did, yeah.
55:00 MK: Oh. Of course you did.
55:00 MH: So, sorry Moe, let me update the score over there.
55:05 MK: Wait, I’m confused why. I noticed that comment about Moe’s gonna be annoyed that we have another R stats person. I’m team R, or is it just the fact that I keep saying Python’s gonna win?
55:18 MH: Moe, you realize we record this podcast, right? So we have you saying, “Okay, I am going to declare it, the war is over and Python has won.”
55:26 MK: Python has won. It has won, but I’m still team R.
55:29 CP: Has it won though? I feel like…
55:32 TW: Hold up, time out, time out. Time out! All the graphics we get going in the show hasn’t even started yet? I’m still working on this bio, people.
55:41 CP: I mean, I have strong opinions about that, so.
55:49 MH: Rock Flag and hashtag Bayesian TikTok.