Listen to the podcast:
Every week, we talk about important data and analytics topics with data science leaders from around the world on Facebook Live. You can subscribe to the DataTalk podcast on iTunes, Google Play, Stitcher, SoundCloud and Spotify.
In this week’s #DataTalk, we talked with Dr. Kirell Benzi about ways he is using data science to generate positive emotions through art. You can follow him on LinkedIn, Instagram, Twitter and Facebook.
This data science video series is part of Experian’s effort to help people understand how data-powered decisions can help organizations develop innovative solutions. To suggest future data science topics or guests, please contact Mike Delgado.
Here’s a full transcript:
Mike Delgado: Welcome to Experian’s weekly #DataTalk show, featuring data science leaders from around the world, and we are super excited. This is a brand-new topic; we’ve never covered it before. We’re talking about data art today, and our guest is Dr. Kirell Benzi. He’s going to talk to us about how he uses data science to generate positive emotions and, if you’ve not checked out his art, please do that today. His website is KirellBenzi.com. It’s spelled K-I-R-E-L-L-B-E-N-Z-I dot com. He has some amazing data visualizations; they’re absolutely stunning. They’re beautiful. A lot of work went into them, and we’re going to talk to Kirell about his work, how he got started. Kirell, thank you so much for being our guest today.
Kirell Benzi: Thank you very much for your invitation. I’m really glad to be here.
Mike Delgado: So, Kirell, tell us a little about your story, because what you’re doing is so unique creating data art.
Kirell Benzi: Three or four years ago, I was doing my Ph.D. focused on data science and more specifically network science. How do you use a network as a tool, network theory and so forth, to analyze big data sets? Then I realized people were asking me what I was doing. “Can you show me?” I had no visual. It was hard to explain just using math or images. I realized maybe I could start using network visualization, which until then was used as a tool to explain things in papers and scientific research. The idea was to take the same software at the beginning and try to bend it so I could start to tell stories about the networks. It’s really about what I think. It’s data storytelling.
I realized that people were more attracted to the story than the math, of course. The idea was to create, hopefully, a beautiful picture, and then people would start asking questions. For me, it’d be a way to explain the data behind it. What does it mean? What can we read from the visualization? And so forth.
This is how it started. It’s an academic context, and then I realized maybe people would be interested also in … I’d say the general audience. I had this idea that maybe I could start doing art exhibitions about this. Being a science educator, if you wish, try and teach other people about this. If you talk about big data, they’re a bit scared. They say, “They’re going to steal my data” and so forth.
We saw that quite recently, but if you show them something completely different, they are, “What is this?” “It’s an artwork about Wikipedia that explains how people go and what they visit on the webpages. It’s completely anonymous, so don’t worry.” They say, “OK. That’s interesting. Why are you studying this?” Then you start to say, “We try to model the behavior of people, what they browse online and so forth.” Then they get hooked. It’s a nice way to redirect them to scientific problems, but using nice visualizations.
Mike Delgado: Tell us about your process, because it’s so creative. Like you said, you’re working with large data sets and building models. Working with the data. What’s your artistic approach to working with data?
Kirell Benzi: The first thing, as you said, it’s working with the data. I need to get a grasp on … Most of the time, I don’t have the data, so I need to create and build my own data sets. The best-case scenario for me is that you give me a nice database. It’s free. Everything is clean, and it’s easy to work with. That’s usually not the case. I’ve been getting crazy requests, from scraping websites to analyzing emails from the CEO of a company. He gave me all his emails, all the confidential information. I had to sign an NDA, and he said, “Can you extract the topics?” I was in Switzerland at the time. There are three official languages in Switzerland. People would answer in different languages, and it’s hard [inaudible 00:04:45].
This is something that I do extensively, and most of my time goes into getting, cleaning and understanding the data from these more or less clean data sets.
As you can see on my website, I work with networks. Most of the time, the data is actually not a network. You need to define … OK, if you want a network, it’s points linked together somehow. You need to define the relationship between the points. What would be interesting to visualize as a story? You have to think about this and say, “Maybe this kind of metric will do to compare the points.” Or maybe the network is given. For instance, if you use social media, like Twitter, you can say, “If I follow someone, there’s an actual network.” You follow someone, so there’s a link, and it’s pretty easy to extract the topology, the structure of this.
Once you have your network, then you need to apply some tools on the network to extract the communities. Who are the big players in this network? Who are the followers? If you have a lot of connections, in terms of graph theory you have a higher degree. It probably means that if you make an artwork, you should be at the center, because you use models, like physical models. It means you attract more people around you. You need to identify who the key players in this network are, and the different communities. You can also use the colors. In my artworks, all the colors mean something. It’s another method … Sign of the data.
There’s a dimension that is encoded in the colors, so you need to say, “What did I present that makes sense?”
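To make the graph vocabulary in this answer concrete, here is a minimal Python sketch of degree and grouping on a tiny "who follows whom" network. It is an illustration only, not Benzi's actual code: the names in `follows` are invented, and connected components are used as a crude stand-in for the community detection he mentions.

```python
from collections import defaultdict

# Hypothetical "who follows whom" pairs; the names are invented for illustration.
follows = [
    ("ana", "kim"), ("ben", "kim"), ("cho", "kim"),
    ("dev", "kim"), ("ana", "ben"), ("eli", "fay"),
]

def build_graph(edges):
    """Return an undirected adjacency map: node -> set of neighbours."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)

def degrees(graph):
    """Degree of a node = number of distinct neighbours it is linked to."""
    return {node: len(neighbours) for node, neighbours in graph.items()}

def components(graph):
    """Connected components, a very rough proxy for communities."""
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

graph = build_graph(follows)
deg = degrees(graph)
hub = max(deg, key=deg.get)   # the highest-degree node, a candidate for the center
groups = components(graph)    # one group per disconnected cluster

print(hub, deg[hub])
print(groups)
```

In a real project a graph library and a proper community-detection algorithm would likely replace `components`, and the group index could then drive the color of each node.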
Then you start to interact. The artistic process starts right here. You’ve got all the data. You’ve got all the charts and all the stats you can do about the data set. Then you start to interact. Sometimes it’s completely abstract. I’m sorry, no constraint. One of my last projects, I had a code writer say, “I want a volcano. Build me a volcano. Keep it as a data visualization so I can read the data from top to bottom and sort them by years.”
This was my brief, and then I had to create around it. It was a very good exercise because sometimes you need some balance to be able to get your best work. It was very interesting.
Mike Delgado: That is very cool. As an artist, I think one of the difficult things you have to decide is when are you done with a piece. When are you finished? When do you let it go?
Kirell Benzi: Yes. It’s crazy. At some point you say, “I don’t have time anymore. I need to let it go because I’ve spent too much time with this.” But no, that’s not the right way. So you go back. For instance, I have this artwork that I did on Pokémon. Remember Pokémon GO was crazy at that time. The idea at that time was for me and my team, the coders that I work with at the university, to study virality. How is it that this phenomenon was so popular, one hundred million downloads in just one month? It was crazy. So I started to get the data from YouTube.
The idea was to fetch every video from YouTube that was talking about Pokémon. For each video, get every hour, the number of views, the number of comments, the number of likes, the number of dislikes over time and [inaudible 00:08:19]. Just in terms of bandwidth, this was one terabyte of bandwidth. It was crazy. The university sent me an email and said, “I don’t understand. You do as much traffic as half the school. What’s happening?”
They thought my computer was getting hacked, but it was just … I sent them a nice email, “Don’t worry. It’s for science. It’s for research.” Once I started collecting data, I created this kind of Poké Ball shape. You know, a Poké Ball is the thing you use to catch Pokémon.
Mike Delgado: Yeah.
Kirell Benzi: The idea was to create the general layout, representing the content of the generalization like the Poké Ball. Each dot would be a [inaudible 00:09:07]. At some point I realized it was nice to share with people, having fun with this, but I realized it was not artistic. It was more about fun and not about something we would put in a museum. Two years later, I came back and took exactly the same [inaudible 00:09:31].
I tried to twist the visualization so it would still keep this spherical shape, but it would look more like something that would appear round. More clean, at least to me. And now I can put it in art exhibitions. To answer your question, I think that I have to restrain myself. I keep going back because I feel like improving, and I can always do better. It’s crazy.
Mike Delgado: For those listening to the podcast, you have to check out the Poké graph. Again, Kirell Benzi dot com. The Poké graph is awesome. It’s the shape of a Poké Ball. It’s just brilliant. It’s so fascinating. Tell me about how you go about choosing the colors for your visualizations.
Kirell Benzi: You have to think about data visualization theory, something that I teach at the university. They say you should never put more than eight colors in a visualization, because your brain is not capable of processing more than eight colors at the same time.
Mike Delgado: Really?
Kirell Benzi: Yes. In order to efficiently interpret the data or communicate in terms of different information. The thing is, I’m not doing data visualization. I’m doing data art. For me, emotion is the first thing. Second thing, you can interpret the data, but emotion has to come first.
I’m allowed to choose more colors, but I still try to keep the color palettes very consistent in terms of interpretation so hopefully you will not see mismatching colors. They have a meaning and they are fun.
The way I choose the colors … I’m using perception principles, how vision works and how the shades of colors work. For instance, some artworks have the same luminance; the value of your perception of lightness is the same for all colors in the artwork. It’s nitpicking, I know. No one would care about this. But this is something … I try to choose color palettes like this.
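One way to read the "same luminance" idea is sketched below in Python. This is an assumption-laden illustration, not Benzi's method: it uses the standard sRGB relative-luminance weights (0.2126, 0.7152, 0.0722), and the function names, the target value of 0.4 and the saturation of 0.8 are all hypothetical choices. Each hue is darkened toward black or lightened toward white until its relative luminance hits the target; because perceived lightness is a monotonic function of relative luminance, equal luminance implies equal lightness under this model.

```python
import colorsys

def srgb_to_linear(c):
    """Undo the sRGB transfer curve for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(c):
    """Apply the sRGB transfer curve to one linear channel in [0, 1]."""
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

def relative_luminance(linear_rgb):
    """Relative luminance of a linear-light RGB triple (sRGB primaries)."""
    r, g, b = linear_rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def match_luminance(linear_rgb, target):
    """Darken toward black or lighten toward white to reach the target luminance."""
    y = relative_luminance(linear_rgb)
    if y >= target:
        scale = target / y
        return tuple(c * scale for c in linear_rgb)
    t = (target - y) / (1.0 - y)  # blend fraction toward white
    return tuple(c + t * (1.0 - c) for c in linear_rgb)

def equal_luminance_palette(n_colors, target=0.4, saturation=0.8):
    """Evenly spaced hues, all adjusted to the same relative luminance."""
    palette = []
    for i in range(n_colors):
        srgb = colorsys.hsv_to_rgb(i / n_colors, saturation, 1.0)
        linear = tuple(srgb_to_linear(c) for c in srgb)
        palette.append(match_luminance(linear, target))
    return palette

def to_hex(linear_rgb):
    """Encode a linear RGB triple as an sRGB hex string."""
    channels = [round(255 * linear_to_srgb(c)) for c in linear_rgb]
    return "#{:02x}{:02x}{:02x}".format(*channels)

palette = equal_luminance_palette(6)
hex_codes = [to_hex(color) for color in palette]
print(hex_codes)
```

Rounding to 8-bit hex introduces a small luminance error, so the match is exact only on the floating-point values in `palette`.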
Mike Delgado: Well, it shows. You’re an artist. You have the way that you’re going to do something, the colors that are going to appeal to and have certain meanings to you. I just love looking at the different visualizations you’ve created and the colors that you used, making data look beautiful. When you’ve had it on exhibition, some of your art, the reactions, the emotions people have seeing it. Can you talk about that? Because it’s a huge part of art.
Kirell Benzi: Yes, exactly. So I’m striving for two different emotions. First, as you said, I hope they think it is beautiful. People are attracted because they think, “It’s fancy. I like the colors. I like the shapes.” People have told me, “It’s beautiful.” Fine. The second, it’s not even an emotion. Something I look for is the cognition. You are attracted to the artwork, but then you start to read the description. You start to understand that it’s real data. It’s not just an artistic representation of data. So you can click, if it’s interactive, and each dot, each line means something in terms of … It’s like a line or row in your Excel spreadsheet, for instance.
When people realize that, you see there is a spark in their eyes. “Ah, I get it. I could do this with my own spreadsheet using my data sets.” If you [inaudible 00:13:21], so that they become more aware of what the data is and that everything around us can be quantified in some way and used as a source to create an artwork. When those conditions are met, some people tell me, “I don’t like data, but now I almost like it.”
Some people told me, “I feel smarter now.” I’m not bragging. I was happy when people told me that. I realized that maybe this is something they were looking for and it was a bit mystical, you know, big data, AI. You can use AI in every sentence. Everything you do is AI now, and blockchain. Now when people realize that they can see and understand what big data is, or just data in general, I feel happy. This is something that I strive and look for when I do participation.
Mike Delgado: The different visualizations I saw on your website are very, very creative. Where do you get your ideas from?
Kirell Benzi: I don’t know. It depends. The thing is it depends also on such an intricate process, as I told you. If I have no brief, no constraints on the visualization, it’s something that I will … I try different visualization principles and then I, I don’t know. Shapes. I like abstract paintings and abstract art in general. I used to do abstract art when I was younger, when I was a teenager at some point. I realized this is how I reconnected with this side of me. Starting at 14, I started to do programming, computer science and so forth. I said, “I won’t do art.” Then I realized that I could connect and go back to the abstract shapes I was doing back in the day. I would say this is a bit cliché, but I would say nature. You know why? [inaudible 00:15:33] like clouds or the shape of a mountain, or anything, you would start to see patterns that you can find again in the artworks. That is really cliché, so I don’t want to say it.
Mike Delgado: Now this next question is going to be a hard one because you’re an artist and every piece has certain meanings to you. Is there a favorite or a series of favorites of data visualizations you’ve worked on that you’re really proud of?
Kirell Benzi: This is a very nice question. I love everything that I do. You know, it’s like an imposition. No, I think in terms of the time it took to create artwork, sometimes it took me three weeks. For instance, the Wikipedia one, it’s called Secret Knowledge. It’s about Wikipedia. This took me eight months.
Mike Delgado: Wow.
Kirell Benzi: So when I look at it … Just because I needed more than six months to process the data, it’s just 5 million visits at Wikipedia English, keeping to English is 6 million articles and 300 million links between articles. I have all this tech with Wikipedia and on top of it all the visits of over a year to extract the patterns of what people were browsing. This took so much time and when I look at it, it reminds me of all the studies that we did, the findings that we discovered while studying the data sets. I think it is particularly close to me, to my heart.
Also ones about the Malta Jazz Festival. You maybe didn’t know, but this is one of the last days of the Malta Jazz Festival this year. Every year it’s one of the top music festivals in the world. My first artwork was about Malta Jazz. When I look at it … It’s all displayed thereof. When I look at it, it reminds me of why I first started to, as I was telling you earlier, try to show people what it’s like to be given data and how to represent it in a nice way. I think Secret Knowledge and When the Music’s Good, the two names.
Mike Delgado: I’ve gotta look at the Secret Knowledge one again after you just explained that. Eight months of work. Wow, that is a ton of work.
Kirell Benzi: It was the degree of the data science work, trying to understand. I told you the data size was so big that the network itself was 40 gigabytes on the [inaudible 00:18:15]. Just the network itself, without taking into account all the time series. The time series give you the number of visits per hour, per page. So six different time series over a year, sampled every hour. It’s really big. It’s a terabyte by itself with the visits.
Mike Delgado: Wow. Now aside from your artwork, you also teach data visualization. Can you talk a little bit about your teaching? As you’re working with students, maybe things that you share with them on how to improve their data visualizations to tell better data stories? We do have people in our community that are trying to do that. They’re trying to make better data visualizations. Can you share some advice for some of those out there listening in?
Kirell Benzi: Yes. Thank you for asking. For the students, it depends on their background. What I have is people coming from data science or computer science. They are really good in terms of programming, algorithms and so forth. They have a hard time in the design part. As you said, like the storytelling part.
The first thing I tell them is, “Your project will be graded on the storytelling part.” Sometimes it was good, like technical. It was good, latest [inaudible 00:19:50], with way more data processing, using the latest [inaudible 00:19:54], but the visual was awful. It’s a shame, but I guarantee you, no one will get it. If you put it on Twitter, no one will retweet it, because it’s ugly. You need to find a way to present the data so that it’s pretty clear in terms of reading. They included a lot of text everywhere, different views that did not fit together. The visualization as a whole was not making sense. I would say … You know this cliché sentence. Less is more. You’ve heard this?
Mike Delgado: Yes.
Kirell Benzi: This is something we use in our [inaudible 00:20:31] functions, something we use in architecture. When you present something, it has to mean something. Sometimes it’s good to have a very complex length of views, but you need a way of explaining it before. What I like to do when presenting my artwork is to put the description below the artwork. It would not make sense for me to have my artistic work without a description. Otherwise, it’s just a pretty picture. People will think, “We can have pretty pictures with other generic art trends, but it doesn’t mean anything.” For data visualization especially, you need to be very careful of how you present. When you make the visualization, you choose a side. You cannot say, “I’m very impartial. I’m just presenting the data.” It’s not true. Inside the data, you can always tweak them so that you can say whatever you like. It’s OK. Just present what you want to say first, and then people will be [inaudible 00:21:29]. Does that make sense?
Mike Delgado: Yeah, it totally does. Are there any pet peeves or mistakes that people make in data visualizations, you see it and you cringe?
Kirell Benzi: Yes, there’s one mistake. It’s using pie charts. You know, “Death to pie charts” is a very famous saying in the data visualization community. I’ve used a pie chart before, but not everyone uses the pie chart. There’s something very true about the pie chart. As humans, we have trouble judging angles correctly, and the pie chart is all about angles. If I show 35 percent and 30 percent on a pie chart, it is going to be hard for you to distinguish them. If you have a lot of values, say 10 different categories you want to put in a pie chart, then some slices will be very small, 1 or 2 percent. It’s impossible to read.
What people say is, usually use a bar chart, but maybe you line it up particularly so it’s something … You see the bars from left to right. We are very good at distinguishing length, width and height. It’s better in terms of visualization to use bar charts instead of pie charts. The worst of the worst is the 3D bar chart.
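The point about length being easier to judge than angle can be tried out with a small, dependency-free Python sketch that prints sorted horizontal bars as text. The category names and percentages are invented for illustration (they reuse the 35 versus 30 percent example), and `text_bar_chart` is a hypothetical helper, not a function from any charting library.

```python
def text_bar_chart(shares, width=40):
    """Render percentages as sorted horizontal bars, largest first."""
    largest = max(shares.values())
    label_width = max(len(label) for label in shares)
    lines = []
    ordered = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    for label, value in ordered:
        # Bar length is proportional to the value; tiny values still get one mark.
        bar = "#" * max(1, round(width * value / largest))
        lines.append(f"{label:<{label_width}} {bar} {value}%")
    return lines

# Hypothetical category shares, invented for illustration.
shares = {"A": 35, "B": 30, "C": 20, "D": 10, "E": 3, "F": 2}
chart = text_bar_chart(shares)
print("\n".join(chart))
```

Even in plain text, the 35 and 30 percent bars differ visibly in length, and the 2 and 3 percent categories remain readable, which is the contrast with a pie chart that the answer describes.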
Mike Delgado: (laughs)
Kirell Benzi: I don’t know like … With the [inaudible 00:23:06] you will see all the parts and the space. One of the good reasons for that is we humans also have a hard time distinguishing perspective and depth, especially because we work on a 2D screen. When you see a visualization, you have a hard time seeing what’s behind it. You can see something that’s bigger or smaller than what the data says. This is called the lie factor: how much the data lies and how you tweak the visualization so it’s not correct in terms of interpretation.
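The lie factor mentioned here is usually attributed to Edward Tufte, who defines it as the size of the effect shown in the graphic divided by the size of the effect in the data. A minimal Python sketch of that ratio follows; the function names and the numbers in the example are made up for illustration.

```python
def effect_size(first, second):
    """Relative change from the first value to the second."""
    return (second - first) / first

def lie_factor(data_first, data_second, drawn_first, drawn_second):
    """Effect shown in the graphic divided by the effect in the data."""
    shown = effect_size(drawn_first, drawn_second)
    actual = effect_size(data_first, data_second)
    return shown / actual

# Hypothetical example: the data doubles (10 -> 20), but the drawn
# bar grows from 1.0 cm to 4.0 cm.
factor = lie_factor(10, 20, 1.0, 4.0)
print(factor)
```

A value close to 1 means the graphic represents the change faithfully; values well above or below 1 mean the drawing exaggerates or understates it, which is what perspective in a 3D chart tends to do.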
These are my two big … Otherwise, you can use whatever. They’re not [crosstalk 00:23:43]. Pie charts are.
Mike Delgado: With augmented reality and virtual reality, what do you think the future holds for data visualizations as people start to use … Do you think that it’s going to be improved or are you worried?
Kirell Benzi: I don’t know. I don’t believe in the 3D pie charts and VR. I really don’t. I think it’s going to be awesome for data art. It brings you emotion, like a video game, for instance. For now, I think the technology is not there yet. I have glasses, and it’s impossible for me to put the headset … It doesn’t work. You can see the pixels. For now the resolution is too low, but it’s going to improve, of course. At some point, it will be very immersive, especially with sound. For data visualization itself, if you work in 3D space, for instance, you want to visualize manufacturing parts. This makes sense. You see the product in 3D. You see how it fits in your car, but for 2D vision on a 3D headset, I don’t see the point in betting that. I’m [inaudible 00:24:56].
I don’t see the uses. I know people have been working on this, but would you really … If you want to show visualization, you link up with your team. You get a team meeting and you want to share it. Everyone should have a headset to see the visualization. It would be better just to have a beamer in, and you project it. You see what I mean? I’m not sure about this.
Mike Delgado: What tools do you recommend for those who are just starting out with data visualization. What tools would you recommend that they get to know and use well?
Kirell Benzi: I think it depends on your background. If you are tech-savvy and you know how to program, you can directly start with D3.js (Data-Driven Documents). It’s the most famous JavaScript library to do web visualization, but you need to know how to program. There’s also a steep learning curve. Otherwise you can use Tableau, for instance. It’s actually pretty good. You can create dashboards very easily, and there is Tableau Public. Now you can create the visualization from your computer and upload it so people can view it online. You can talk about it, share it, comment. This is pretty good. The thing is, of course, if you use software like this you’re a bit limited in what you can do with the visualization, because if the visual item does not exist in Tableau, you cannot do it. If you can program, you can go beyond this, create your own shapes or your own visuals. It’s harder. You need to spend … It takes longer than just using off-the-shelf tools for creating bar charts, for instance.
Especially if you do web visualization, you will spend most of your time on the data. How do you feed the data to represent something that runs in your web browser? Sometimes you need a server somewhere that will stream the data. How can you make the request so that it’s efficient, it’s secure, things like this? You would spend more time doing the whole infrastructure than just creating the visualization.
Mike Delgado: You mentioned that you are teaching your courses, you’re grading your students based on how well they tell the data story. Can you talk a little bit about what is it about visualizations that can help better tell data stories? Things that you’re looking for from your students?
Kirell Benzi: Yes. Well, it’s not only about data story. It’s also graded on different criteria. One is technical [inaudible 00:27:39], so whether or not the code is pretty clean. I like to read this. As you said, one of the criteria that I look for, is it appealing? I can always study the nicer New York articles, for instance, with interactive data visualization included in the article. We learn basic stuff, as you know, because this is what you do. Where is the beginning then we go and [inaudible 00:28:03] travel, like a red line usually, to follow something, like a trail in the visualization. Not put everything at once at the same time. You can use the [inaudible 00:28:15] glass item. People will have everything that they want and then you can scroll to different slides. You see the visualization coming to life as you move on. You can have a big story but with just small charts inside that are interactive and fit the story.
Or you can have a full dashboard where everything is interactive, but it’s harder. It needs to be very intuitive. What I usually recommend is to put the smaller introduction … So the first time you go on the website, there’s smaller panels. This is a visualization about this. This is how we can use it. We can scroll. We can pan. We can turn around and so forth. Then you can go and drill down in the data. It depends on the kind of story. If you’re more about data journalism, you probably need more text and then small, very impactful visuals. If you create a dashboard for a company to monitor their retails or you have Google Analytics, for instance, monitor the number of visitors at your website, it’s OK. Once you know how to use the web visualization, you won’t need any text. It depends on the data set.
Mike Delgado: Friends, we are talking with Kirell Benzi, and I want to highly recommend that you check out his data art. Read the stories behind his art. Again, the URL is K-I-R-E-L-L-B-E-N-Z-I dot com. Dr. Kirell, for those that want to reach out, get to know you, what is the best way for them to contact you?
Kirell Benzi: Any social media, really. I’m on Twitter. I’m on Instagram, Facebook. Everything is fine. It all goes in my mailbox anyway. Email if you want. Probably using the social media. If you want to follow some of the artworks and comment, that would be awesome.
Mike Delgado: Wonderful. On our Experian blog, I’m going to put links out to your social profiles, so they can follow you there and connect with you. The Experian blog, where we’ll have a full transcription of today’s show, the video and links to different ways to get the podcast, is just ex.pn/datatalk54. That’ll bring you over to the Experian blog, where you can get links to learn more about Dr. Kirell Benzi and also check out his artwork and the beautiful stuff that he’s doing to make us have emotions around the data. It’s a lot of work. It’s stunning. I highly recommend you guys check it out. Dr. Benzi, thank you so much for being our guest today.
Kirell Benzi: Thank you very much for the invitation.
Mike Delgado: Take care.
Kirell Benzi has held a Ph.D. in Data Science since 2016, which he obtained at EPFL (Ecole Polytechnique Fédérale de Lausanne), where he teaches data visualization to over 180 master’s students. He became passionate about digital art at a young age, and as a teenager he started exploring digital creation by making abstract visuals based on photos that combined both art and computer science.
His expertise in software engineering and creative coding has allowed him to design new methods specially dedicated to his passion for data. Consequently, his work has been shown on over 100 websites in 10 languages, including Gizmodo, Engadget, Daily Mail, The MarketPlace, TechRadar, Co.Design, Phys.org, VICE and Digital Trends, for his analysis of the Star Wars universe as well as for his interactive visualisations of the data surrounding the viral Pokémon GO phenomenon.
Check out our upcoming data science live video chats.