How to Hire a Data Scientist: Key Skills & Personality Traits @ExperianDataLab (Episode 23) #DataTalk

Every week, we talk about important data and analytics topics with data science leaders from around the world on Facebook Live. You can subscribe to the DataTalk podcast on iTunes, Google Play, Stitcher, SoundCloud and Spotify.

This data science video and podcast series is part of Experian’s effort to help people understand how data-powered decisions can help organizations develop innovative solutions and drive more business.

To keep up with upcoming events, join our Data Science Community on Facebook or check out the archive of recent data science videos. To suggest future data science topics or guests, please contact Mike Delgado.

In this #DataTalk, we talked with Luca Zuccoli, Experian’s Head of Analytics and Data Lab in Asia Pacific, about how he hires data scientists and what skills and personality types are important.

Here’s a full transcript:

Mike Delgado: Hello, and welcome to Experian’s weekly #DataTalk, a show featuring data science leaders from around the world. Today, we’re excited to talk with Luca Zuccoli, and we’re going to talk about how to leverage AI for business impact. Luca is the head of data labs for Experian’s Asia Pacific regions. He has his master’s in statistics from Carnegie Mellon University and an MBA from INSEAD.

Just to let you guys know, this broadcast — Luca is in Singapore, I’m in Costa Mesa, California. We couldn’t get the audio working, so we’re using WhatsApp for audio and we’re using BeLive for the video portion, but it’s all synced up perfectly.
And for those who don’t know, this is our second attempt. We did a full #DataTalk two months ago that was awesome. I enjoyed meeting Luca. It was a great broadcast, but the audio quality was so bad. So this is part two, and we’re grateful to have Luca as our guest today. How’s it going, Luca?

Luca Zuccoli: It’s going great. Thanks, Mike. So this is part two, the revenge.

Mike Delgado: Exactly. Luca, can you share with us your journey? What led you to start working in data science? Because even the term “data scientist” hasn’t been around that long. Share with us your path.

Luca Zuccoli: My story starts way before we had cool names for our title, so I was not a data scientist. You know what? Even worse, I started as a statistician and a researcher. I studied economics, then I studied statistics. I went to Carnegie Mellon, and I’ve done research over in Italy.

Once I was there, I realized that what got me excited was managing to apply my knowledge to a real-world problem. It’s not that you don’t apply to real-world problems if you’re in academia, but typically the output of your research is a white paper. The fantastic thing that I discovered, and I’m still grateful to that extent when I was in the U.S., was that the output of my knowledge could be having a real impact on people’s lives. I’ve done things with our [inaudible 00:02:34] in CMU, all to improve the different teaching techniques. That excited me.

Then I started a career not as a pure statistician modeler, but as a hybrid. Throughout my life, to have a bigger impact, I focused on two things, not only one — which is to say the modeling and technical aspect, but also the human side. Both in terms of understanding the consequences of an improvement, some modeling or some technique and also to communicate better. Hence, my MBA. That’s where I’m coming from. Not only did I not start with a cool title, like data scientist, but I can add on top of that that I would define myself as a hybrid, which sounds ominous. I bridge the two worlds of the business world of normal humans, if we want to go [inaudible 00:03:35] as a quote, and the world of data scientists and statisticians.

Mike Delgado: You’re head of our data lab in Asia Pacific. Tell me a little bit about the development of your team. The hiring process, what you were looking for, because a lot of people in our data scientist community are grad students or they’re currently studying computer science or stats. They’re looking for their first job in data science, and they’re always curious from a data science leader like yourself what you are looking for. Can you share with our community when you’re interviewing somebody, what are key things you’re interested in learning about them?

Luca Zuccoli: Absolutely. Let’s start with what is standard and obvious. There is a certain background that is needed. Although one of the things that I’m very interested in these days … Let’s make a distinction between data scientists and statisticians. A data scientist is, in my view, a combination of knowing some methodologies but also having some expertise about infrastructure, because the two things are linked together. Back in my time, if I had to use SAS, pretty much any machine would do. That was not a problem. It was not the problem of how to scale up an algorithm that has to work in high volume. So now there is this combination of technical and methodology. That’s a data scientist in my view.

I like to find people who are not only data scientists in terms of background and in terms of expertise, but also statisticians. Statistician means they have some theoretical knowledge. I think it gives you an advantage. No matter what the progress and the advances that we will have in terms of creating new algorithms that are based on computational capability, the ability to understand the underlying theoretical foundation is extremely important.

This is from the technical side, but equally important, and I consider this a necessary condition — what allows me to say, “This is a candidate for data lab. I want to work with this person.” It is the personality.

Surprisingly, what I’m looking for is not necessarily people who look like they are super geniuses. You never understand if someone is a super genius until you work with them. I don’t believe in flamboyant types who are spouting things. Also, because more often than not, they don’t look as smart as they think. But I believe in gritty people. Maybe because I’m like that as well. But you give them a challenge, they will not let up. So back to this concept of what data science is. Data science is about trying lots of things, it’s about spending a lot of time, it’s about sometimes grinding it through. It’s not just about, “I had a blinding flash of illumination, of enlightenment, and here we go. New idea.”

So grittiness. Commitment. Extremely important.

Second thing is curiosity. These are two things that typically are related to intelligence, but they are not the same thing. Curiosity. You know the curse of extremely smart people is sometimes they’re not challenged because throughout their academic career, they are always at the very top of their class. Until the challenge gets serious and you meet really smart people and so they don’t apply themselves. These are people who will not work in data lab and for me. The reason is very simple.

Data lab is about being cutting-edge, not now, always, and to be cutting-edge always you have to be willing to read on your own, you have to be curious, you have to be motivated to always ask yourself, “What else do I not know?” And you have to be comfortable with being uncomfortable, to look into what you don’t know. Then you have to have the perseverance to say, “I’m going to learn about this.” And it’s not because it’s a duty. It’s because it’s a passion.

This is what I’m looking for. I must say that thanks to our colleagues in North America, we actually inherited a very interesting test to gauge, to a certain extent, the commitment, the grittiness and the skill of potential candidates. I used a variant, so I added some questions to it, but I think it’s an awesome idea, and I have to thank our colleagues. If I may digress one minute, I can tell you what we do.

Mike Delgado: Go for it, Luca.

Luca Zuccoli: We give them a data set and a relatively open-ended set of questions. It’s up to them. If I’m a superficial person, I will do the quick and dirty thing. I can hear someone telling me, “Oh, but you give a test, you give it at home so they could cheat, someone who’s very smart could do it for them.” I mean, come on guys, seriously?

Mike Delgado: Yeah.

Luca Zuccoli: First of all, you will present to me and my team, and we’re all experts, for two hours. In two hours, I can assure you, if you’re not the one who’s doing it … Plus this test, this analysis, all the analysis can be done in five or six hours, or you can spend three, four days, or even more.

Mike Delgado: Wow.

Luca Zuccoli: Good luck finding an expert data scientist who’s going to spend three or four days to make you look good in a job interview. But aside from that, that has never happened. I interviewed hundreds of people. That has never happened.

But aside from that, what this gives me is also a very good gauge on how interested you are in this type of job. You’re not going to spend three days, four days if you are not committed to this type of position, and I will describe the position.

This is, by the way, the second round. The first round, I will tell you what we’re all about. So if you are excited, you will spend those three, four days. I have people who spend more than 30 hours on this task.

Mike Delgado: Wow.

Luca Zuccoli: So that’s a sign of commitment …

Mike Delgado: Yeah.

Luca Zuccoli: … and curiosity. And that’s exactly what I’m looking for.

Mike Delgado: No doubt. I love that. So somebody who’s spending 30 hours on a project like that is obviously showing grit, it’s showing curiosity, it’s showing drive and motivation, which are all plus signs. So let’s say that the person does all that, they spend 30, 40 hours on this project because they want to get it right, but then they miss something.

Luca Zuccoli: Yeah.

Mike Delgado: … But they’ve shown they’re gritty.

Luca Zuccoli: Yes.

Mike Delgado: So how does that play out?

Luca Zuccoli: I try not to get people involved at this stage unless there is a very high probability.

Mike Delgado: OK.

Luca Zuccoli: The first round, we are [inaudible 00:11:14] about methodology. But to be honest, let’s be clear about what we are testing when we’re interviewing a person. Mentally, you’re testing communication skills more than anything else. If I don’t give you a task, you can read a few articles about a few methodologies and if you are a good communicator, you can be convincing. Up to a point. You don’t fool me 100 percent.

Typically, that’s not the case. But it is true that, let’s say that you are an introvert, and there are lots of people like that, and I do hire introverts, as well. That is no problem at all. Then you’re a bit emotional in the interview. So maybe you’re not so good, but I understood that actually you got it. That’s the part where if I think there is something in there, then I will give you a big task. I do explain very clearly what this is all about. I’m completely fine with people backing off and saying, “You know what? I’m actually not going to use that much time.” Or what I’ve found is that when people want to use this position as a filler — let’s say that I prefer another offer but I want to have a second one to negotiate better — then they will do the five-, six-hour version that’s very simple. It’s their choice. It’s completely fair what you’re saying.

First of all, I’m not giving this task when I don’t think there’s a more than fair shot, like 70 percent, and second, what we look for during this two-hour interview is not a perfect score.

To be honest, it is an open-ended task, so there is no perfect answer, right answer. We look for people who are reasoning and are willing to question their results. There are many kinds of angles in which you can look at this problem. Sometimes I was the first person to say, “We’ve never looked at this this way. Are you sure? Have you checked this and that?” So we are much more interested. This is not a multiple choice test at the end of which you have, “You scored 93 out of 100.” It’s much more of an open dialogue, it’s much more of a “Let’s interpret.” Which is really what we do in our job, right? We question, we push …

Mike Delgado: Yeah.

Luca Zuccoli: We try to see whether there’s more to be done.
Also, I think the great thing about this test, at least the way we administer it, is that you get a very clear sense what it’s like to work with my team.

Let’s say that you have a Ph.D., you are an expert in deep learning. It’s not that you come in and you say, “Okay, guys, I’ve got the solution.” Let’s bow to that. “Oh, it’s great.” We criticize each other, in a constructive way, but we work on each other. Have you heard about this concept of … what is it called, not parallel thinking … An innovation from the mental, the idea is that you should not do this, obviously, never, negative criticism, but on top of constructive criticism is actually even better to build upon each other’s ideas. That’s the key point here.

I’ve never seen this angle in … I’m quoting about what happened in my last interview of this type. A guy picked up the data and looked at that from a completely different angle than anyone else after 100-plus interviews. So that’s pretty amazing.

Mike Delgado: Wow.

Luca Zuccoli: Which doesn’t mean that he was right. And I’m still not 100 percent sure whether he was right, to be completely honest.

But what I was very interested in is the way he reasoned about it. The way he was open to also plug in his own theory, and that’s something important. If you wish, on top of curiosity and grit, a certain degree of self-awareness, if not humility, because in that way you can find out more.

Mike Delgado: Yeah. You were talking about curiosity, grittiness, all these soft skills, but at the end of the day, humility has got to be there.

Luca Zuccoli: Oh, yeah.

Mike Delgado: Because if you’re being challenged, right?

Luca Zuccoli: Yes, absolutely. Humility doesn’t mean, “Yes, boss.” I’m not that type of person. I don’t hire those type of people either. One of our best data scientists is the person who contradicts me the most. Says, “You know, Luca, maybe, but you haven’t been around this methodology long enough.” No, I’m just kidding. He never calls me on the fact that I don’t do the model with them. He explained to me sometimes why he thinks he’s right. Sometimes he is, sometimes I am. That’s what is important.

These are the two concepts, in my view, of a flat organization. I am not a believer when people are telling me, “This is a flat organization.” What does that mean? Is it that everyone has got the same responsibility and the same say? No. An organization in which everyone can decide on everything just doesn’t work. What I believe in is you have to feel empowered to say, “My opinion is very valuable. Let’s argue about that.” It doesn’t matter whether you are reporting to me or not.

At a certain point, we decide that we’re going in a certain direction. Otherwise, it’s chaos. But we have to have these spaces in which I feel completely comfortable in saying, “You know what? Why don’t we go in this direction. The next three months of work will be in this direction, not in this other.” That’s the way we create real [inaudible 00:17:10] and many times, as I said, it’s a 50/50. Actually, I’m not even sure it’s 50 percent that I’m the one showing the direction. I see myself as more the orchestrator than the main expert in my group.

Mike Delgado: I love that. I love also the way that you defined humility and what that means. That it’s not, “Yes, boss.” You’re not looking for that.

Luca Zuccoli: No.

Mike Delgado: You’re looking for somebody who will challenge ideas in a kind and thoughtful way.

Luca Zuccoli: Absolutely.

Mike Delgado: I love the fact that you concentrated on these soft skills, because the hard skills, like the stats background or the background in computer science, the maths, that’s all domain knowledge and you’ve got to have that for the role.

Luca Zuccoli: That’s right.

Mike Delgado: But equally as important, like you’re saying, is you’ve got to have the personality that’s going to fit in with the group, that’s going to be gritty, that’s going to be curious, that’s going to challenge ideas, but also have the humility to back down and know when to back down.

Luca Zuccoli: That’s right.

Mike Delgado: Yeah.

Luca Zuccoli: The thing, Mike, it’s not only … To me there’s a very important consequence if you put together people of this type. Then not only you have individuals who are good individuals for a learning organization, but that’s how you create a learning culture. Learning culture means innovation.

There are some other factors which make innovation effective, but the fact that you create a machine, a group of people, who together learn faster and better than individually. That’s my definition of a learning organization. And that requires some degree of challenge, lots of mutual respect, that’s for sure. Empowerment and this kind of mindset in general.

Mike Delgado: So somebody is sharing an idea, a thought on a way to solve a business problem, and part of the way your lab works and part of the way the culture is is to challenge so that you can come up with an even better idea potentially. “Yeah, that could work, but what about if we did this?” Can you share an example of how that plays out and also where the person who’s sharing the idea originally doesn’t have hurt feelings? You know what I mean?

Luca Zuccoli: Yes. I can tell you one thing. Different groups who are doing R&D, I start with my [inaudible 00:20:13], I open — If I’m raving too long, cut me off.

Mike Delgado: No. Go for it, Luca.

Luca Zuccoli: The way I define my [inaudible 00:20:23] for data lab is that I don’t do basic research, or at least not as often now. So what is the difference between basic research and what I consider commercial R&D? Basic research is something that you do for mankind or, if you wish, in the past I would say [inaudible 00:20:42], used to do lots of basic research. The motion from Google, they don’t have necessarily a direct impact on your business. With our commercial R&D, things are definitely projected toward the future, but I want to have a result within one year or two years. Maximum.

That means that — There is lots of richness that comes in posing yourself constraints. That’s how we work well creating and interacting with the business line, which is my first rule. What I’m trying to do is something that works in the real world, that has an impact on the real world. As a consequence, number one I constantly have connection and feedback and discuss ideas with people outside of my group. I start with the most atypical group of people. I talk a lot with salespeople.

I’m trying to understand. Everyone is hearing the buzz around machine learning, about big data, and, honestly, sometimes clients have a bit of a weird idea of the capabilities. But with these weird ideas, sometimes you get lots of challenges. Lots of interesting things.
So talking with salespeople. Talking with business [inaudible 00:22:21]. I’ll give you an example. We created two products, a FraudNet [inaudible 00:22:26] and now something that we call telco scoring, recently, and both products came from conversations with the business.

In the case of FraudNet [inaudible 00:22:37], I heard several times the other fraud saying, “We’re getting hammered here because we don’t have something that has machine learning inside of our best-in-class.” It’s a bit of a buzz, but clients want it. The first conversation was one year and a half ago and then eventually culminated and created FraudNet [inaudible 00:22:58], which was something built so … In two seconds, [inaudible 00:23:04] is basically taking our best-in-class FraudNet and managing to improve the performance, quite dramatically, using machine learning.

But what was important in these products was that we managed to do that within the constraints of a certain timeline and without trying to break apart our product. Having very clear, real-time constraints. Having very clear, real-world, strategical constraints. It’s a strategic decision if someone is saying, “I have FraudNet. It works really well. I make it through an open architecture,” and you didn’t have that luxury.

That’s the idea. Talk with the line of business, with salespeople, people outside of your world. Honestly, the innovation, I am not a big believer of brainstorming. I am a believer of exchanging ideas. Brainstorming, there are even studies about that, so I’m not going to chant the virtues of something that I don’t believe in. I do believe in open communication with people who have real needs because then an interesting idea comes through. Then about 70 percent is actually changing radically what they’re telling me. Sometimes someone is telling me, “I want this,” and I know that that is impossible, but let’s talk about what benefit you are looking for. What is the need your clients have? Those are the kind of conversations. These are typically very focused meetings we’re having. We constantly touch base. And also very informal. This is helped by the fact that we have a strong relationship, also a personal relationship.

With telco scoring we have done something similar. We started off with the idea, we worked with the company inside Experian that we partially own, Experian MicroAnalytics, and they were doing what they call Dynamic Airtime Advance. They predict, if you have a prepaid card, whether you are a good bet to be given a few dollars of top-up, that you will pay it back. This is an idea that is very important in an emerging market, not so much in markets like the U.S., but for us in emerging markets it’s extremely important because paying these extra dollars might mean that you have to go to a kiosk, might mean that maybe it’s more convenient just to throw away your SIM card.

The next step for us was to say, “Since you have a relationship with the telco, you also have very rich data, you have CDR (call detail record) data, but to do that you need machine learning, you need NLP to actually develop a new proposition.” Since then we figured out that you can actually monetize this data and give cash loans. Cash loans, which again are not as familiar in the U.S., are basically very small loans. It can be $100, $200, $500 for people who are unbanked. These are the bread and butter of financial inclusion.

Mike Delgado: And that can be huge, especially for small-business owners.

Luca Zuccoli: Exactly. It’s a gigantic business. It’s not only doing good things in the world for financial inclusion, but it’s also a real business, a completely different business model. But to do that, that’s where this concept of dealing with real-world constraints is extremely useful. Let’s say that I’m just an academician and I want to write a white paper, and I say, “With this data we’ll get this Gini. It will be fantastic.”

This is the real world. The real world is you are the partner [inaudible 00:27:02] Telco. It takes one year and a half, two years to establish it. At the end of the day, do you have all the raw data, everything that you will want and dream of? Absolutely not. Particularly if you are in a merchant market. You have to deal with the data marked with the fact that they have some serious limitations. So you will get only some data. Then I have to deal with the fact that — I am dealing with the specific business owner, who maybe doesn’t have the budget or doesn’t have the infrastructure because he has to deal with legacy infrastructure to put together a big data framework. I have two choices. Either I say, “No, I want it. Let’s wait until you replace your legacy, or let’s have an [inaudible 00:27:47] cloud.”
This may take time. I have to weigh the pros and cons.

So eventually, what we did was try to figure out with a fast prototype where were the biggest gains. We managed to say, “Actually we don’t need petabytes of data. We need just a few terabytes.” We can focus on certain attributes, and that will give us already 95 percent of the predictive power. So this is what I mean when dealing with the real-world constraints. We created, eventually, two products for regeneration and cash loans, using the best methodology with machine learning, creating completely new attributes. But more importantly, they are in the market now. When it comes to this type of innovation, the value of something that works marginally better in one year and a half from now, and you easily get delayed by one year or two years if you don’t accept some constraints, if you don’t accept some limitations. If you don’t work in the ideal world.

Typically, these limitations are on the data, on the infrastructure, on the ability to use a more complex algorithm. If you accept these limitations, you go to market two years earlier. It’s a rule of thumb, might be wrong, it might be one year in some cases, it might be never in others.

If you do that, that’s how your innovation is relevant, because in two years the world has moved a lot already. In two years, someone else is already taking care of financial inclusion or someone else has penetrated your concept, and I’m talking about real-world issues. I’m not going to name one of our competitors in Vietnam, or should I? This is internal to Experian? Anyway, we destroy them, completely shatter them.

Mike Delgado: There we go. I love this. This is great. Go ahead, Luca.

Luca Zuccoli: But you know, Mike, the reason we did that is because at a certain point we realized they started off one year and a half before us and they had all the relationship with the telco, so they had more data, they had better things. They didn’t have our contextual expertise in [inaudible 00:30:17], so we managed to do some things slightly different from them. We just finished the first POC. We absolutely destroyed them. Let me tell you what I mean by absolutely destroyed them.
The Gini of our model was more than twice as high as theirs. With the same data, we managed to create better attributes, a host of different things. We get things in the high 40s, even 50s for some segments, and they get things in the 20s. So that’s what I mean with utterly destroying them.
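
For readers unfamiliar with the metric: the score Luca cites here is most likely the Gini coefficient used in credit scoring, which is commonly defined as 2 * AUC - 1 and quoted in percent. The transcript does not say how his team computed it, so the following is only a minimal sketch under that assumption. The function names and the toy scores are hypothetical and are not from the conversation.

```python
from itertools import product


def auc(scores, labels):
    """Area under the ROC curve, computed pairwise.

    Equals the probability that a randomly chosen positive case (label 1)
    receives a higher score than a randomly chosen negative case (label 0),
    with ties counted as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def gini(scores, labels):
    """Gini coefficient under the assumed definition 2 * AUC - 1."""
    return 2.0 * auc(scores, labels) - 1.0


# Hypothetical toy data: label 1 marks the event being predicted
# (for example, a default), and a higher score means higher predicted risk.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]

if __name__ == "__main__":
    print(f"AUC:  {auc(toy_scores, toy_labels):.3f}")
    print(f"Gini: {gini(toy_scores, toy_labels):.3f}")
```

Under this definition, a Gini of 50 percent corresponds to an AUC of 0.75 and a Gini of 20 percent to an AUC of 0.60, which gives a sense of the gap between a model in the high 40s and one in the 20s. The pairwise loop is quadratic in the number of cases, so it suits small illustrations rather than production volumes.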

Mike Delgado: Wow.

Luca Zuccoli: But the point is, maybe I could have done something even better if I had dreamed about not having constraints. My guys are used to working with me, and they know they are not going to live in the ideal world. That’s fine.

Mike Delgado: Yeah.

Luca Zuccoli: If I wait two years to get the ideal situation, you can be sure that these guys will pick up the right tricks, the right methodology, the right combo, the pieces that are missing, and then there is no way to do something that is marginally better than them and penetrate the market. So now, it’s also a matter of understanding how you get the innovation in the market with the right timing so that it’s still impactful. This is really what I’m striving to do.

Mike Delgado: Yeah.

Luca Zuccoli: And that’s why I love real-world constraints, because I think that’s where we can differentiate ourselves. When it comes to sheer methodology, I think for very strategic reasons lots of the methodology has been made Open Source. It’s very difficult to say, “My deep learning is absolutely better than yours, given the same amount of data.” So what I need to differentiate is the ability to impact the business, the ability to work with the right type of real-world constraints so that things will work and work really well.

Mike Delgado: I love that. So I know that we titled this “How to Leverage AI for Business,” but this conversation has just been fascinating, and it’s appropriate. It’s called #DataTalk for a reason. I just love that this conversation has gone from how Luca has developed a data lab, how he has hired his team, things that he’s looked for. I have found this incredibly interesting. I know, Luca, we could probably talk for hours. It’s been a blast chatting with you.

I have tons more questions, but I know that we need to go. For those who are listening to the podcast, if you want to get links to follow Luca or to connect with him on LinkedIn, the URL to go to is just: ex.pn/datatalk43. That will bring you over to our global news blog where this video chat is posted along with a transcript and links on how you can connect with Luca on LinkedIn. He has a great profile there. Make sure, if you’ve listened to this episode, to connect with Luca and let him know that you listened.

About Luca Zuccoli

Luca Zuccoli earned his Master of Science degree in Economics from the University of Pavia, Master of Science in Statistics from Carnegie Mellon University and his M.B.A. from INSEAD. He serves as the Head of Analytics and the Asia Pacific Data Lab for Experian. He is fluent in English, French and Italian.
