Every week, we talk about important data and analytics topics with data science leaders from around the world on Facebook Live. You can subscribe to the DataTalk podcast on iTunes, Google Play, Stitcher, SoundCloud and Spotify.
In our weekly #DataTalk, we talked with Kristen Kehrer, Senior Data Scientist at Constant Contact, about ways to get your very first data science job.
This data science video series is part of Experian’s effort to help people understand how data-powered decisions can help organizations develop innovative solutions.
To keep up with upcoming events, join our Data Science Community on Facebook or check out the archive of recent data science videos. To suggest future data science topics or guests, please contact Mike Delgado.
Here’s a full transcript:
Mike Delgado: Hello and welcome to Experian’s weekly DataTalk, a show where we talk to data science leaders from around the world. Today, we’re actually talking about a very, very important topic. In fact, Kristen, this is one of those topics that we get in our data science community on Facebook all the time. People are wondering, “How do I get started working in data science?”, and it’s probably the most prevalent question, the most asked question, so I’m super excited to have you as our guest. Folks, we’re talking to Kristen Kehrer.
She’s a Senior Data Scientist at Constant Contact. She got her B.S. in mathematics from Dartmouth, and then she went on to earn her Master of Science in Statistics. She has been working in the field for a long time and is very active on LinkedIn, answering people’s questions. It’s an honor to have you, Kristen, as our guest today. Thank you so much.
Kristen Kehrer: Thank you so much. I’m so excited to be here.
Mike Delgado: Kristen, can you share with us your journey that led you to begin working as a data scientist?
Kristen Kehrer: Yeah. I mean, it’s probably a long journey because I finished my bachelor’s degree in 2004, which was a little before the term ‘Data Scientist’ was coined, and I didn’t know that I was going to go into data science. I knew that I didn’t want to be a teacher.
I tried a couple of different things, and I found myself in a role that didn’t have a ton of opportunity for me in terms of growth. We’re still not at the point where data science is really a term, but I had seen that people who had a background in statistics were successful. I saw a lot of job opportunities for statisticians and people who could build models, and so I decided to go back and get my master’s degree in statistics. It was a very interesting time too because this is like 2007, 2008, the bubble.
The housing bubble was about to burst. I’m working on financing things for a real estate company, and so job security wasn’t looking so good, and it just seemed like the right time to go and hide in academia. I got my master’s degree from WPI.
The faculty there was very good about helping me to get a job after I finished. I had originally thought that I wanted to be a community college teacher or something, but this job came around that my professors suggested to me. They said, “Hey, do you want to go do Econometric Time Series Analysis for NSTAR?”, and I said, “Sure. That sounds cool”, so I was building neural nets to forecast hourly electric load, and that was used for capacity planning during heat waves. Like it was imperative that they were able to use the output of that model to determine how they were going to allot capacity and how we were going to keep everybody’s lights on.
Mike Delgado: Wow.
Kristen Kehrer: I built a lot of Time Series Analysis models using ARIMA predominantly, and that has been so valuable throughout my career. Any job that I’ve worked at afterwards has wanted to know how things are trending over time and, “How can we forecast that, and what insights can we gain from that?”, so I highly suggest learning Time Series Analysis to anybody who has the opportunity. From there, I realized that there was a whole world of analytics, and that I was taking part in just one small piece. Like I was building cool models, but it was one small piece, and so I moved to the broader analytics area, where at this time, I don’t know, maybe data science was a term, but I referred to it as ‘Advanced Analytics’, and so I was using a little bit of coding. Along the way, I picked up SQL and was building models to drive business value, and I’ve worked in a couple different industries, and now I’m at Constant Contact doing data science.
Mike Delgado: You were doing data science before it was coined the ‘Sexiest job’, right? At what point did you decide, “This is the career field for me”? Was there a certain project you were working on or something that you were doing that you’re like, “Yeah, this is something that I’m really enjoying. I’m not going to pursue academia anymore to be a professor. I want to stay in business”?
Kristen Kehrer: Yeah. I’m incredibly passionate about mathematics, and I absolutely love statistics, and as … I don’t know if it’s because I’m a woman, but I’ve always felt like I had something to prove, like I’m not going to lie. In that first job where I was building neural nets, and I also did like a mathematical proof to show why our choice of T value is greater than one in our model would reduce the forecasting error, and therefore, made it a more efficient model because we had to make these submissions to the DPU, and I’ve always felt in the roles that I’ve been in highly valued. I felt respected.
I felt that other people valued my intelligence, and that was something that was important to me: that I was able to use my brain and think about problems in a way that a lot of times, especially now more recently in my career, is really out of the box. All of those things are really attractive to me, the fact that I have to really think, that it’s a skill set that not everyone has, and that not everyone can both build models and easily distill that information for the business, and really advocate for ways that we could be thinking about things that could optimize a process or add value.
Mike Delgado: Today’s topic is very, very important to our data science community because there are so many people who are just graduating from college or just finishing up some certificate courses in something around data science. One of the questions that we get a lot is, “What are the most valuable skills to learn to help prepare you for a job working in data science?”
Kristen Kehrer: Yeah. I don’t think we talk about it enough, but SQL is … If you have finished a degree in a quantitative field or you’ve taken some MOOCs, and you haven’t yet learned SQL, you need it. Not having it is a non-starter because the majority of the models that you’re going to build or the analysis that you’re going to do is going to be on data that’s probably living in a data warehouse. You may be working with big data technologies, but even there, as an example, Hive has a SQL-like query language, so it’s applicable there too.
In the real world, we don’t just get handed data sets that are clean. In the real world, we are joining across different tables in a database to structure the data the way that we need it to be for analysis, and so if you don’t have that ability to get in and self-serve and get that data yourself, you’re going to be really hindered.
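To make the point about joining tables concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table names, column names, and rows (customers, orders, and the sample values) are hypothetical and were not part of the conversation; they only stand in for the kind of warehouse tables a new data scientist might be pointed at on day one.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, made-up tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT,
            region      TEXT
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER,
            amount      REAL
        );
        INSERT INTO customers VALUES
            (1, 'Ana', 'East'), (2, 'Ben', 'West'), (3, 'Cy', 'East');
        INSERT INTO orders VALUES
            (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
        """
    )
    return conn


def spend_per_customer(conn):
    """Join customers to orders and aggregate to one row per customer.

    LEFT JOIN keeps customers with no orders; COALESCE turns their
    missing total into 0.0 so the result is ready for modeling.
    """
    query = """
        SELECT c.name,
               c.region,
               COUNT(o.order_id)            AS n_orders,
               COALESCE(SUM(o.amount), 0.0) AS total_spend
        FROM customers AS c
        LEFT JOIN orders AS o
               ON o.customer_id = c.customer_id
        GROUP BY c.customer_id
        ORDER BY c.customer_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in spend_per_customer(build_demo_db()):
        print(row)
```

The design choice worth noticing is the LEFT JOIN: an inner join would silently drop the customer with no orders, which is exactly the kind of detail that matters when the joined table becomes the input to a model.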
Mike Delgado: Yeah. I don’t think I’ve heard a lot of … I mean, SQL has been mentioned in previous broadcasts, but a lot of times, people are focused on talking about programming languages like R and Python, and so I’m really glad you’re talking about SQL as being one of the most important languages to learn to help you with your structured data.
Kristen Kehrer: Yeah. Yeah. I mean, of course, I’m using R and Python every day because I have a problem where I can’t make a choice.
All my models were built in R up until probably six months ago, and then I started making the move to Python, and so now, I’ve been doing a lot of coding in Python, but I still leverage R every once in a while because I just can’t completely let go of it. Sometimes I also use rpy2 to call R through Python, but I think about being able to walk into a job on day one. Like, “What are you going to be doing?” They’re going to say, “Here’s our database, and this is what you’re going to be using to get the data to make your models.”
I think as much as we like to talk about cool technology and we certainly can do a lot of cool things, like SQL, man. SQL.
Mike Delgado: All right. You heard it from Kristen. “SQL, man.” I got to quote you on that. It’s all about SQL, man.
No, that’s good. I think that’s really valuable advice for our community because like you said, day one, “What are you going to be doing? Here’s our datasets.” You got to start playing around with it. You need to know SQL, and I don’t think that is talked about a lot, and it’s definitely something we don’t talk a lot about here in DataTalk, so thank you so much for sharing that.
Recruiters get really inundated with applications. I’m on LinkedIn, and pretty active there, so whenever there’s jobs like even here at Experian for data science roles, I’ll get a lot of messages on LinkedIn, “Hey, can you share my profile with a recruiter?” I’ll talk to recruiters, and they’ll say they’ll get like 300 resumes in a week for a data science role.
What advice would you have for somebody who is brand new, just starting out, and doesn’t have any work experience to speak of? How can they stand out in this pile of resumes or LinkedIn profiles that recruiters are looking at? What can help them?
Kristen Kehrer: Yeah. I think there are definitely ways that you can leverage LinkedIn. I don’t think people think enough about … Like you can go and connect with the recruiter and comment on their stuff, and not just when he or she has a job posting available. Like proactively go and engage in conversation, and add value for that person so that when you do go and you send him or her a message on LinkedIn, they know who you are. You’re not just another random person who’s messaging them.
You’re the girl that has been talking in their comments, and they’ve interacted with you. Then, the way that you position yourself on LinkedIn when you’re reaching out to these people is you really want to make sure that you’re not talking about what it is that you’re looking for.
You want to talk about how you believe that you fit that role, so if you reach out and you say … I’m trying to think of how I say it because I do … On my last job hunt, I absolutely was leveraging LinkedIn, but you reach out and you say, “Hey, I’ve noticed this position. I’m a person who is skilled in Python, SQL”, and mention some different types of modeling that you’ve done, and it can be from your coursework.
You don’t have to say it’s from your coursework, but you want to make sure that you get a response and that you get noticed, right? We just want to position ourselves differently, showing that you can, because if you can write some Python, and you can write some R, and you can do some SQL, and you can do some modeling, you don’t need to say like, “Oh, I learned this in school.” You can just say, “Hey, I’m so and so. I saw this position open, and I’d love the opportunity to speak with the correct person. I’m looking to get my resume in the right hands, and I have experience with Python, SQL, Data Analysis, and building models, and I hope you’ll forward my resume to the appropriate person.”
It’s not going to work all the time, but I think that there are some things that you can be aware of, and you can really think strategically when you are going to reach out to these people. If you can try and build a relationship before reaching out, that’s all the better.
Mike Delgado: Where do you … I’ve seen some different comments and blogs, people talking about the value of posting your portfolio on GitHub, but I’ve also read some blog articles about GitHub being pointless for putting your portfolio. Kind of curious about your thoughts on where to put your portfolio work as a data scientist.
Kristen Kehrer: Yeah. I mean, I have an enterprise GitHub, so if you were to check my personal GitHub, I don’t even have my projects there, however, I do have a blog, and so for anyone who’s looking to really highlight their skills and abilities, if you want to go ahead and do it, start a blog. It is incredible for … Each time you write an article, post it on LinkedIn, post it on Twitter. Get people’s eyes on your blog.
Don’t just post it and leave it there, but share that article on LinkedIn. GitHub’s for code, and some comments, and a README, but demonstrating that you were able to do these technical things and that you were able to talk about them in a way that other people can read is a highly valued skill. I only started my blog in March, and it has opened up a ton of opportunities for me. Like I can’t even speak to it enough.
Mike Delgado: That’s awesome to hear. We just actually got a question here on Facebook Live from a software engineer. He’s asking … This software engineer is looking to break into the data science world, and, “What do you suggest for professionals like us?”
Kristen Kehrer: Yeah. I get that question a lot, especially from people who have just finished a C.S. degree, and they want to get a job in data science. My thought is always to take a C.S. job because they’re no joke. They pay well. You can find fantastic jobs in computer engineering.
It’s not talked about as much as like they talk about data science as the hot topic, but people are clawing to get at developers, and as you’re working as a developer, you can take some MOOCs at night to get that machine learning piece. Then, when you go to position things on your resume, there are certain things that you’re already doing that are of huge value to a business if they are looking for a data scientist, things like automating processes, like you’re just … You’re able to take MOOCs at night and learn a couple skills, and put them on your resume, and then just start applying and working on the way that you’re marketing yourself because the marketing yourself for a data science position is like a huge piece of it, but yeah.
I think with a C.S. background, you’re in a fantastic spot to hop into the field. My boss was from a C.S. background.
I know a number of people from C.S. backgrounds who made the move, and they’re … Having the C.S. first kind of gives you a leg up.
Mike Delgado: You just mentioned about just the importance of continuing to learn, and I was kind of curious. I do see lots of posts on LinkedIn about different boot camps, data science boot camps that are out there, courses on Udemy, Coursera. I’m kind of curious about your view of those different types of courses and how you view the certifications of those courses. Are those valuable?
Kristen Kehrer: I don’t necessarily see the certification, like the paper itself as valuable. Certainly, add it to your LinkedIn, because why not? Maybe don’t add it to your resume because that is prime real estate that you need to really think strategically. You only have one page. I don’t like it when I see two-page resumes, but in terms of the courses themselves, yeah, highly valuable.
I actually asked on LinkedIn a while ago for people to suggest their favorite ones, and there’s a Python A to Z course that everyone recommends, so you can see the courses that other people recommend, right, social proof, and try and take ones that are good because some of them … Not all courses are created equal, but I’ve personally taken a MOOC on Git. I’ve taken a MOOC on Python, and when I decided to make the hop from R to Python, I started with the MOOC. I also used Codewars, which is free, and I was able to learn just a little bit about web scraping and about writing to a database. Now, I had been using databases for years, and I had helped move data that was not in a great schema, or there was just like a bunch of snowflake tables.
I had helped in the transition of how to structure that data as we moved over to a star schema, but I had never actually written to a database. I feel like I pick up little nuggets of awesomeness everywhere, and even this far in my career, I’m still always every month or two taking a new MOOC to just … I don’t know. I just find them fascinating. I love that after my kids go to sleep, I can watch some video, and learn something new, but that’s just part of my personality.
Mike Delgado: Are there certain online classrooms you recommend over others? Like I said, there are so many boot camps that are out there, and I was just kind of curious, since you asked that question on LinkedIn to your community, and you just mentioned one of those courses. Was that on Coursera? Was it on Udemy? Do you happen to remember?
Kristen Kehrer: The course that I took, Python For Everybody, is on Coursera. Most of the courses that I have taken are on Coursera, because I always liked the fact that they were coming from an accredited university, but at the same time, I’ve heard other people’s opinions on other courses as well, and I can’t say that I know all there is about sort of the landscape and who is out there. I just know that they are valuable.
I actually thought about taking a boot camp the last time I was switching jobs. I was thinking about becoming a full-stack developer, but of course, I stayed in data science because that’s where my heart is, but just the idea of taking a couple weeks off of the job search and learning something new just sounded like so much fun, but yeah, definitely. Sorry. I can’t help any more with narrowing down the most useful platforms, but I have always had luck with Coursera, and it’s worked for me.
Mike Delgado: Nice. I got another question here on Facebook Live. “Hey, Kristen, I want to know what are some of the tools that I can learn and practice SQL and T-SQL from?”
Kristen Kehrer: Yeah. The first option is always to take a MOOC. Secondly, I think a lot of people don’t realize that SQLite is open source, and so you can download that. You can find different data sources, whether you go into Kaggle. Some of those are a lot of times really large.
I’m trying to think of … I had seen a really great data website before they aggregated a ton of … Kyle McKiou on LinkedIn. He has an article that he lists like 20 free online courses or something, or 20 data science resources, and number 20 was a website that had a ton of free data, that it was just there. It was like this huge site that this person had aggregated a bunch of data resources, but yeah.
I mean, you can set up SQLite. You can download data, and now, you are playing in there for free, and of course there’s YouTube tutorials if you didn’t want to actually take a MOOC, but there are absolutely MOOCs out there for specifically learning SQL and T-SQL.
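As a rough illustration of the free practice setup described here, the sketch below loads a small CSV into SQLite and runs a query against it, using only Python’s standard library. The CSV contents, the table name sales, and its columns are invented for the example; in practice the text would come from a data set you downloaded, and you could pass a file path such as practice.db to sqlite3.connect instead of using an in-memory database. Note that SQLite speaks its own SQL dialect, so T-SQL-specific syntax would still need a SQL Server environment.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a downloaded CSV file.
SAMPLE_CSV = """city,month,sales
Boston,1,120
Boston,2,90
Worcester,1,60
"""


def load_sales(conn, csv_text):
    """Create a sales table and insert every CSV row; return the row count."""
    conn.execute("CREATE TABLE sales (city TEXT, month INTEGER, sales INTEGER)")
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["city"], int(r["month"]), int(r["sales"])) for r in reader]
    # Parameterized inserts avoid building SQL strings by hand.
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def total_sales_by_city(conn):
    """A first practice query: aggregate sales per city."""
    return conn.execute(
        "SELECT city, SUM(sales) FROM sales GROUP BY city ORDER BY city"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(load_sales(conn, SAMPLE_CSV), "rows loaded")
    print(total_sales_by_city(conn))
```

Once the data is in a table, the same connection can be used to practice joins, window functions, and subqueries without any server setup.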
Mike Delgado: Awesome. What are some common interview questions that a new data scientist should be prepared to answer when going in for a job interview?
Kristen Kehrer: Absolutely. I mean, so when you get the phone screen and you pick up the phone, they’re going to say, “Hi, Kristen. Is this still a good time to talk?”, and you say, “Yes.” Then, they’re going to say, “Okay. Great.”
“I have your resume in front of me. Can you please tell me a little bit about yourself?” Here, you don’t want to give away the farm. You’re just looking to show that you can explain who you are in a concise sort of way. I typically go with something along the lines of, “I’m a data scientist with eight years of experience, working across healthcare, the utility industry, and I’m currently in E-commerce, and I have a master’s degree in statistics, and a ton of experience building different models that I’d like to tell you about.”
It’s just like three lines. That was sort of off the cuff. I haven’t interviewed in over six months, so not my best work, but you get the idea.
Mike Delgado: Yeah. Yeah. Yeah.
Kristen Kehrer: Then-
Mike Delgado: It’s like your elevator pitch. I like that.
Kristen Kehrer: Yeah. Yeah. Yeah. You have to have the elevator pitch ready. Then, the other questions that you … I mean, there’s really four specific ones, and I’m sure I’ll only be able to remember three, but, “Tell me about a time with a difficult stakeholder and how it was resolved.”
“Tell me about a time that you explained technical material to a non-technical audience.” Yeah. I can’t remember the other two, but they’re in one of my blog posts. I was asked these same questions over and over again, and so those are the behavioral questions that you’re supposed to answer in the STAR format, so you clearly give sort of some context and background, talk about the problem, talk about the solution, and the results, and then of course, end it with, “And that is a time that I worked with a difficult stakeholder.”
Mike Delgado: After this episode is over, I’m going to make sure to link over to that blog post that you just mentioned with those four questions. For those listening to the podcast, the URL is ex.pn/datatalk55, and that is a place where … I’ll go ahead and put the links to Kristen’s article for those that are interested in those behavioral questions, those four key questions that get asked quite a bit. I think those will be really valuable to go over.
I think the one that you’ve mentioned that I thought is really interesting is about the communication aspect, communicating something that’s very technical to somebody who’s not familiar with that jargon. Can you talk about how important that communication is for the data scientist?
Kristen Kehrer: Oh my God. It’s so incredibly important. I pretty much promise you that if you have four interviews with four different companies, you will be asked this question at least once, if not more. I think that when we talk about data science, a lot of times, if somebody said to you, “What’s a data scientist?”, a lot of people are going to say, “It’s somebody who writes code, maybe production level code. It’s somebody who builds models. It’s somebody who does analysis”, but the big pillar of that is also business acumen.
I see data science as a very cross-functional, interdisciplinary field where I am routinely working across all sorts of departments to understand their needs so that those can be inputs into my model because if I build out this beautiful cluster analysis, but I haven’t talked to other areas of the business, it may not be something that they’d want or need, so you’re getting buy-in first across the organization. Having that buy-in is what allows you to have value. Then, at the end, after I build a model, I’m always presenting that model afterwards. I’m not talking about the Fourier transforms that were used to determine whether or not a customer was seasonal.
I’m talking about what percentage of our customer base was identified to have seasonal patterns, and what do these seasonal customers look like, and so when I answer the technical question myself on the interview, I typically start with, I give this example from when I was working at Vistaprint.
At Vistaprint, I was asked to do a behavioral cluster analysis of our digital customer base, but when it came time to talk to the stakeholders, I brought it up a level, and so I’m clearly telling the person that I’m talking to that I was bringing it up a level. I was like, “I brought it up a level.” I wasn’t talking about the methodology. I was talking about the size of the opportunity and the behaviors that each of those clusters had, so I wasn’t talking about hierarchical clustering or PAM. I was talking about we have this group of high spenders who are very highly engaged.
They’re utilizing all of the resources at their disposal to get in contact with us if they need help and other things, and talking about the other groups. That’s the picture that I paint during the interview for this person when I’m asked, “How did I explain that technical concept to a non-technical person?” The answer is, “I brought it up a level, and I talked about the opportunity, yada, yada, yada.”
Mike Delgado: I love that. When you’ve got … I mean, that’s a huge skill to talk about something very, very technical, and then bringing it up to a level where what you’re saying is going to be vitally important to the business leaders, because they don’t care necessarily about the model or the algorithms you’re working with.
Kristen Kehrer: No.
Mike Delgado: They care about the insights and how it’s going to help grow their business, and so to be able to talk about levels is going to be crucial for your success as a data scientist.
Kristen Kehrer: Yup.
Mike Delgado: Have you ever, while presenting had to deal with pushback from leaders, like they didn’t agree with your results? Did that ever happen?
Kristen Kehrer: I’m trying to think. I mean, there’s always questions and feedback, especially when you are presenting to senior leadership and you’re a data scientist, and you report up through maybe marketing, maybe a different department or whatever, but if you are presenting to an area of the business and senior leadership that’s sort of outside of your wheelhouse, they’re absolutely going to have questions about things that you maybe haven’t thought of.
Mike Delgado: Yeah.
Kristen Kehrer: That comes back to getting that widespread buy-in that’ll help to mitigate some of that proactively, but absolutely, people are going to have questions, and they’re going to challenge you. That doesn’t mean that what you did was wrong. It’s an opportunity to look at your analysis another way and maybe improve it, or maybe add an extra dimension to it to help fully explain the story.
Mike Delgado: Cool. I know our time is up. I have just one last question. I actually found this question on Quora from a bunch of people who were wondering. The question was, “What do experienced data scientists know that beginner data scientists don’t know?”
Kristen Kehrer: So much, because I speak to a lot of people, and when I’ve hired in the past, yeah, you just don’t realize how much you know until you have to break it down, because really, especially since we were already talking about presentations, I think that that is one of the big pain points, is that most people out of school who start their job in data science, they don’t know how to write a deck effectively.
It’s not one of those things that was covered in school. Maybe if their background is in business, maybe you got it, but if you came out of a statistics or a C.S. background, I bet the farm that your first [inaudible 00:33:40] isn’t going to be pretty, and it’s that same idea.
It’s that we’re not … Like in grad school, we spend a lot of slides talking about the methodology that we used because that was correct for that audience, and now in business, the audience has changed, and our presentations need to be structured to reflect that, so starting with a high-level overview, talking about only the insights that are important, and not necessarily the details that went into it, and finishing with a summary and talking about your next steps, and including nice visuals, and maybe using a branded template, and the verbiage, but yeah I’ve seen some really bad presentations, and if I was going to pick one thing, because people will pick up the technical things, and maybe you haven’t had the opportunity to build the type of model that is the best solution for the project at hand, but there’s also opportunities where I’m going to be leveraging new methodology, and I’m going to research that even though I’m nine years in. That’s something that’s constantly growing and evolving, but yeah, presentation skills.
Mike Delgado: This has been a wonderful discussion, Kristen. Super valuable to me and to our community because there are so many people who are trying to get their foot in the door to become a data scientist, and all of the things you’ve shared in today’s episode are just super valuable. I want to let everybody know that Kristen’s available on LinkedIn. Kristen, where can people follow you, reach out to you? Where is the best spot that maybe you’d like them to go?
Kristen Kehrer: Yeah. I have a blog. It is Data Moves Me, and so I’m pretty active on my blog. Also, you can find me on LinkedIn, and I’d say that that’s probably where I’m the most active.
Mike Delgado: Okay. Is that Datamovesme.com?
Kristen Kehrer: Yes.
Mike Delgado: Okay. Perfect. I’ll just put that URL on the screen. Check Kristen out. Engage with her posts there. Like I said, I’m going to be putting a link to her LinkedIn profile on the Experian blog.
Again, the URL is just ex.pn/datatalk55, and we’ll have a link to her LinkedIn profile and to her website, as well as her article that she wrote about the common questions that you’ll get when you’re interviewing to become a data scientist. Kristen, thank you so much for your time today. We actually got tons of questions in the queue. Unfortunately, we couldn’t get through them all, so if you do have questions for Kristen, please go to her blog. Post them there, reach out to her, and definitely follow her. Kristen, thank you so much for your time today.
Kristen Kehrer: All righty. Thank you so much for having me. I really appreciate it.
Mike Delgado: Okay. Take care.
Kristen Kehrer: Okay. You too.
Kristen Kehrer is a Senior Data Scientist at Constant Contact. She earned her Master of Science in Statistics from Worcester Polytechnic Institute and Bachelor of Science degree in Mathematics from the University of Massachusetts, Dartmouth.