Every week, we talk about important data and analytics topics with data science leaders from around the world on Facebook Live. You can subscribe to the DataTalk podcast on iTunes, Google Play, Stitcher, SoundCloud and Spotify.
This data science video and podcast series is part of Experian’s effort to help people understand how data-powered decisions can help organizations develop innovative solutions and drive more business. To suggest future data science topics or guests, please contact Mike Delgado.
In this #DataTalk, we had a chance to talk with Nadieh Bremer about the steps to creating effective data visualizations to tell better stories.
Mike Delgado: Hello, and welcome to Experian’s Weekly Data Talk, a show featuring some of the smartest people working in data science today. Today, we’re talking with Nadieh Bremer. Nadieh studied astronomy at Leiden University and the University of California, Berkeley. She’s our very first astronomer on Data Talk. This is a very special episode. She also specialized in cosmology while she was in graduate school.
Nadieh has won countless data visualization awards, including the best individual award from the Information is Beautiful awards in 2017. She’s also the winner of the urbanization challenge from the World Bank and visualizing.org and countless others. You’ve definitely got to check out her blog; I’ll put the URL up in a minute.
Her data visualizations have appeared all over the place, including The Washington Post, Scientific American, Wired, The Atlantic, Fast Company and many others. Today, we’re talking about how to create great data visualizations, and it’s an honor to have Nadieh as our guest. Nadieh, welcome to Data Talk.
Nadieh Bremer: Thank you. Thank you very much for that introduction. Thank you for having me.
Mike Delgado: Nadieh, we like to always ask our guests what your path was to working in data science. Can you share with us your journey?
Nadieh Bremer: Yeah. I started out as an astronomer. At the end of my … sort of the point where I needed to decide if I wanted to continue, I figured out that that was not what I wanted to do. I wanted something more dynamic and more back down to Earth.
I did everything. I did in-house days with banks, with retailers, with fast-moving consumer goods, but I also came into contact with consulting. I signed up for a business course with Deloitte because it was in Barcelona. I said, “Why not?” It was actually with the strategy department, and while I was there the business intelligence department told me, “We’re actually starting up an analytics department, and we think maybe that could be something for you.” They told me what it was about, and I thought, “Well, that’s even better. That’s making intelligent decisions based on data and still being able to analyze the data and do things with that.” No longer about the stars but now about what people buy or the mortgages people have.
That really appealed to me. That’s how I got started with Deloitte as a data scientist, mid-2011.
Mike Delgado: What was that experience like? First job out of college, you’d been working in academia studying stars and now you are working with data. Tell us about that transition.
Nadieh Bremer: I’d been programming and working with data during astronomy as well, because I was more on the theoretical side, so I was working with simulations and analyzing the data there. But the difference that came when I went to Deloitte was twofold. The first few weeks I just brushed up all my machine learning knowledge. I read papers about support vector machines and self-organizing maps and whatnot. When you start out, you’re just the new kid and you don’t really get started on a project. So at least I got time there.
But also the fact that I was going to the clients and you have to be presentable to the clients. Just the team dynamic. I like the fact I was in a small team, usually three or four people of Deloitte within the big client’s company. That made you knit together very fast and become a team. That team would change every few months, and that’s something I liked as well. Every few months I’d have a completely different set of people that I would work very, very closely with.
Mike Delgado: That’s very cool, because you can see how flexible you are. Because sometimes people get very comfortable with teams, and then when people leave it can be very hard to adjust. But the fact that you enjoy working with different people, that’s wonderful.
You won countless awards for your data visualizations. You’re published all over, doing amazing work. Can you tell us your process for wrangling the data and then helping to figure out what the story is going to be?
Nadieh Bremer: Of course. The first thing I need to know or figure out is the question or the goal. What does this data have to tell? What should people learn from this data, or should it have a goal; for example, should people be convinced and sign a petition? The story or the angle usually comes from the clients. If they don’t quite know it yet, I just go through R and make all kinds of plots, and combine that with some of the initial thoughts the client might have, because they are experts on that data. That way I try to find the story, and with that goal I can then go into figuring out what kind of visual representation would show that in the most effective but also interesting way.
For that, it’s a combination of experience. So having made hundreds, maybe even thousands, of visualizations in the past — some super, super simple ones through to more advanced ones and sort of knowing what kind of works well. But I also keep Pinterest boards of the things I find inspirational with data visualization. I have a Pinterest board on radial ones, I have one that’s about geo, and so on. And before I start on one of the bigger projects, I look through that to get inspired and have a collection of possible options in my mind.
Then I start drawing on plain paper, trying to see what sort of sticks and what I think would work for this and try to make at least two or three different versions. Because it’s typically not the first one you come up with that you think would be the best one in the end, so I try to force myself to make a few different ones.
Mike Delgado: It must be hard when you get attached to a certain visualization and maybe your client says, “Try again.”
Nadieh Bremer: Yeah. That is true. That’s also why I start out with just sketching on plain paper. When I present the findings back to the client it’s still on paper, although I might have made it a little bit better, a little bit more professional, but still sketched so it’s rough. Then I’m fine with whatever they choose. Typically I might have a preference and I can tell them the preference, but they might choose something else.
Once, we were already in development, so already making the visualizations with D3 and having spent several hours on it, when the client felt the shape was displaying a political statement they didn’t want to have. Maybe not political, but it was giving sort of an extra effect they … and the data was sensitive, so they needed to be extra sure that would not happen.
Then I just redesigned it. But that only happened once. Typically, after the design phase, clients understand. They have made their choice and nothing major happens afterwards.
Mike Delgado: So Nadieh, you start with a goal, a question that you want to answer. You begin to dig through all this messy data to find the story. Have there ever been times where as you’re going through the data you’re uncovering another story the data is telling you that you relate back to the client?
Nadieh Bremer: You mean a different conclusion than my client wanted it to be?
Mike Delgado: Yeah.
Nadieh Bremer: Let’s see. I think it probably happened during the first years when I was still at Deloitte. I’m not sure if it happened recently though, because then the clients hire me specifically for the visualization of the data instead of the data analysis. I do less of the analysis these days. But I think that happened. Very often, the client has hypotheses and you try to investigate that. But that might not be there. What we typically did is make it extra, extra clear why that isn’t the case. So instead of just saying we didn’t find that, you have to dive into it and really convince the client that what they thought wasn’t true, wasn’t true.
Typically, you do that by first showing them something from the data that makes sense to everybody. So the data shows this, or algorithms show this, and everybody agrees on that and then you can show, well, but, we didn’t see this — so that the client sees the algorithm seems to be correct. They have that confidence in it, and then you can show them that not everything we thought would happen is actually happening because blah, blah, blah. Instead of just showing them immediately this is not true.
Mike Delgado: What I love about you is you have this very analytical coding side, but then you also have this very artistic, creative side, and it’s unique to find individuals like you. I love the fact that you said you go to Pinterest to get the creative juices flowing, start to think about different ways to present data. Is your Pinterest board public so other people can check it out?
Nadieh Bremer: Yeah, definitely. I also follow other people.
Mike Delgado: Okay, cool. So what I’ll do is put this URL on the screen. For those listening to the podcast, the URL that will redirect over to Nadieh’s Pinterest account will be ex.pn/nadiehpin. That’ll bring you to her Pinterest account for you to check out to get those creative juices flowing. I think that’ll be helpful for other data scientists looking for data visualizations. Very cool.
I love this, hearing your process. Can you share with us maybe some favorite examples of data visualizations you worked on?
Nadieh Bremer: Yeah, of course. One of my favorite ones recently was a big project I did together with The Guardian called Bussed Out. It was about homeless people. The Guardian had already found out that homeless people were being sent to other places with a one-way bus ticket, but nobody had investigated what happens to these homeless people once they reach the other side.
So they requested all kinds of data from the different bus programs in various states through the Freedom of Information Act. Then I was asked, and I brought in Shirley Wu, who I’ve been collaborating with a lot, to tackle the data analysis side. This was a special case in that they wanted us to create the visual elements to help with the journalistic story. I quite liked that one just because of the impact of the story: it shows people that this is a thing that’s happening, and sometimes it works, but more often than not it actually doesn’t work. So I like that one.
Another recent favorite: I like to do visualizations in my personal time about things I’m very crazy about. This one was about Cardcaptor Sakura. I don’t know if you’d know …
Mike Delgado: Wait, what?
Nadieh Bremer: Cardcaptor Sakura. I probably say that wrong. It’s a manga, so a Japanese comic in a way.
Mike Delgado: Oh.
Nadieh Bremer: I read that during my teens, and now they started up a new arc in a way. So I was completely into that. I wanted to just visualize the things I read during my youth, so I made a gigantic visualization about all the chapters and the characters and I investigated the main colors from the covers of each chapter. So it became quite big, and it’s such a thing for a niche market in a way. There’s not even a market. There’s just a few people who are as crazy about it as I am and they love it, and I’m fine, I’m happy with that.
Mike Delgado: Alex says, “I love that show.”
Nadieh Bremer: Oh, great.
Mike Delgado: I’m totally out of it. I totally did not … I have not followed that. Is this on your blog, by the way?
Nadieh Bremer: It’s not on my blog, but it’s in my portfolio. You can find it on Data Sketches.
Mike Delgado: Oh, okay.
Nadieh Bremer: It’s part of that project.
Mike Delgado: Okay, cool. I’ll make sure to put a link in the comments of the Facebook video as well as in the YouTube video so people can go check it out. Which reminds me. You have this amazing portfolio. You did a presentation a while back. Was it a year ago? ‘Hacking the Visual Norm’. You gave a talk in Amsterdam, and you had a beautiful slide deck on GitHub. Can you talk a little bit about that presentation? Because it was beautiful what you produced.
Nadieh Bremer: Thank you. Yeah, it was not your normal presentation. It was all programmed so every slide was its own HTML webpage. I used reveal.js to then have a beautiful way to flip through each webpage as if it was a presentation. The first time I was going to speak at an international conference, I knew I wanted to do something special. I wanted to show my visualizations in a way that best represented them, and I also wanted to be able to interact with them.
I knew videos were not going to work, and after searching around a little bit, I found reveal.js, which lets you build a presentation out of web pages. Because all of my visualizations are built on a JavaScript base, I could use that. I could make the visualizations work inside of the slides, so I could interact with the visualizations on the slides, but I could also make them animate. I can press the next button and then all of the circles move from one side to the other side, as if you are triggering an event listener in a way that does a click, for example.
So I made this slide deck in a way. It took me a month and a half of all my free hours. I can’t do it really fast, but yeah, it was very well-received so I was truly very happy about that.
Mike Delgado: What was one of your goals with that presentation? You called it, ‘Hacking the Visual Norm’. What was one of your key messages to the data scientists and artists out there?
Nadieh Bremer: That you should try to do the extra step to take the visualization beyond the default settings, because every data set is unique and has its own quirks that require adjustments to a default layout that you may have chosen or a chart form that you might have chosen to make it even more impactful or effective for your audience. This could be choosing a better color palette or thinking about the chart form. So instead of going straight for the bar or line charts, looking at what other people have done and maybe going for a bubble chart or a force layout. Just not being satisfied with just putting your data into an example and then being done.
Mike Delgado: You worked on a project that I was reading about on your blog with Zan Armstrong. This was for the baby spike data visualization that you did for Scientific American magazine. What was it like partnering up with Zan? And can you talk a little bit about that process?
Nadieh Bremer: That was super. It was great, because they actually asked Zan because she had already talked about this seasonal trend that happens when babies are born. So it’s not averaged. There are trends throughout the day, throughout the year. And then Zan asked me to help her on the visual side, and I just happened to be in the U.S. for three weeks when we were supposed to be creating it. So we had two half-day blocks where we were together and we investigated the data, first figuring out what the angle of the story that we want to tell is and then making a rough design for the visual side.
But she did the data part, so she had found the data, she had processed the data, and made some plots in R and everything, and then I took it more towards coding that up in D3 and Illustrator and then finishing the things up. But just the fact that I could go back and forth with her. I would have an idea and I would throw it at her and she’d tell me yes or no or what her thoughts were and vice versa. I think that was what made it go to that extra level, because sometimes you can get so stuck in your own thinking that you’re not seeing that there could be a better way or what you think is obvious isn’t obvious to somebody else. So, I guess it makes it a lot of fun to be able to work together.
Mike Delgado: How did you end up deciding on that particular visual? I don’t know if it’s possible in podcast form to relay this visualization, but how would you describe it and how did you come up with that baby spike image?
Nadieh Bremer: The baby spike is all about deviations from the defaults, from the average number of babies born. We wanted to show that, especially on a daily basis, you have a gigantic spike of babies being born around 8 a.m. due to C-sections. But there are more nuanced things.
The visualization is in a circle because it’s time. Therefore, we show a day and a week and a year, and they all are circular. We have sort of a line chart going around a circle, but instead of the base being zero, the base is the average value. That means if the line dips below the base, the stuff between … I don’t know if I’m explaining this clearly.
Mike Delgado: It’s very hard.
Nadieh Bremer: It’s all in visualizations. The stuff between the average and the lower than average is filled in with blue, and if it’s above average, the stuff between the average line and the height above the average line is filled in with red, in a very general way. If you see the circular chart, you can really easily see that’s when fewer babies than average were born. That’s when way more babies than average were born. That’s the general visual side.
We decided on that … well the circular thing came because the day is circular. So you want to connect the ends back together because there could be a trend going over that happens during midnight. If you just pulled it out, they seem very far away, whereas they’re actually not. And then the average line doing that reddish bluish thing, that came because we had … At the start, we figured out that the goal of this visualization is to show the deviation from the average. So the average should be prominent in some way.
Of course, when we started out making visuals in R, we set the baseline to zero and we started out with just straight line plots. But by just playing around and having that goal in mind, we eventually came to this idea with the circular with the base being the average line.
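The layout described here, a line wrapped around a circle with the average as the baseline, can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code behind the published graphic (which was built with R, D3 and Illustrator): the hourly counts are made up, and the function names, `base_radius` and `scale` are hypothetical.

```python
import math


def radial_deviation_points(counts, base_radius=100.0, scale=50.0):
    """Place evenly spaced values of one cycle (e.g. 24 hours) on a circle.

    The baseline circle (base_radius) stands for the average value, not zero.
    Values above average land outside it, values below average inside it.
    """
    mean = sum(counts) / len(counts)
    points = []
    for i, value in enumerate(counts):
        # Angle 0 is the start of the cycle (midnight), drawn at the top.
        angle = 2 * math.pi * i / len(counts)
        # Relative deviation from the average, e.g. 0.5 means 50% above it.
        deviation = (value - mean) / mean
        radius = base_radius + scale * deviation
        points.append({
            "angle": angle,
            # Screen coordinates: y grows downward, so the top is negative y.
            "x": radius * math.sin(angle),
            "y": -radius * math.cos(angle),
            "radius": radius,
            # Red fill above the average line, blue at or below it.
            "fill": "red" if value > mean else "blue",
        })
    return points


def circular_gap(i, j, n):
    """Steps between two positions on a cycle of length n, going the short way."""
    d = abs(i - j) % n
    return min(d, n - d)


# Made-up hourly counts with a morning spike around 8 a.m. (illustrative only).
hourly_births = [30, 28, 27, 26, 27, 30, 36, 48, 60, 52, 46, 44,
                 45, 44, 42, 41, 40, 39, 38, 37, 36, 34, 32, 31]

points = radial_deviation_points(hourly_births)
print(points[8]["fill"], round(points[8]["radius"], 1))
print(circular_gap(23, 0, 24))
```

On a straight axis, 11 p.m. and midnight sit 23 steps apart. `circular_gap` shows why the circle helps: on a 24-hour cycle they are one step apart, so a trend that runs across midnight stays connected.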
Mike Delgado: And for those who are listening to the podcast or those watching the video afterward, after this is live, the URL to check out the baby spike, I’m going to create this redirect right after this video show. It’s ex.pn/babyspike. That’ll bring you over to Scientific American, where you can actually see what Nadieh just explained. It’s so difficult to explain, but you did a very good job. Very, very insightful to hear that, and I love how you’re so detailed with the color choices and the color scheme and all that.
You mentioned also in a blog post — I wanted to ask you because I love when you write about art and the creative process — and you said that, I think it was during the baby spike design, you did something and sometimes the design just doesn’t feel right. And how one version of your data visualization felt too polished. Can you talk through that, what that means?
Nadieh Bremer: Yeah. It’s something that comes completely from the creative side. I can’t put it into some sort of objective number or statement, but I look at it and I feel that it’s just not right. And for this case, I had filled the area below and above the average with gradients, say from light reddish orange to dark red. I was looking at it and it looked too perfect, because SVGs, the things I create my visualizations in, are super, super sharp no matter how big or small you make them, because they’re not fixed pixels but really shapes.
I can’t even explain why, but it was just not good enough. I felt there was a better way to do this, to give it a little bit more depth in a way than these two perfect shapes of blue gradients and red gradients. Sometimes I have the feeling and I am unable to find the solution, and sometimes I have the feeling and I just keep on trying or experimenting with different things and then I do find something that works better. Then I can say, “Yes, this is much better and I’m happy with this.” I wish I could say that there was a formula that I could use …
Mike Delgado: Yeah. It’s great because it’s your artistic side, it’s what you feel as you’re working. And you’re balancing your data side, your analytical mind with your artistic side. There’s a battle going on.
Nadieh Bremer: Exactly, yeah.
Mike Delgado: What would you say is some of the most challenging work for anybody who works in data visualizations? And in those challenges, what keeps you motivated to keep pushing forward?
Nadieh Bremer: I’d say there are two main challenges. The first one is figuring out this abstract design. So not looking at colors, but just how am I going to lay out my data? Is it going to be circles or curly shapes? So this is very abstract design. That’s the first challenge.
The second challenge is just browsers and mobile versus desktop and the fact that you almost have to make two visualizations: one that works on a wide screen and one that works on a teeny, tiny screen. Then there’s some things that work in Chrome that don’t work in Safari, or sometimes even IE needs to be made able to work.
That last thing, I can say, is not actually something I enjoy that much, because it has nothing to do with the visual side, where I get most of my enjoyment from. It’s just the technical side. What keeps me going is that I get to do the visual side, so I’m fine with picking up all of these browser bug thingies and mobile and desktop thingies.
And for the design, I guess it’s just a challenge. The idea that you are trying to make something and that eventually you are happy with what you’ve made. People are actually understanding the data better through what I’ve made, and they can also say, “Oh this looks intriguing and I can also understand the main points.” And they can find their own style. I like visualizing lots of data where there is one main story but people can play with it or look at it more deeply and then find other stories they might be interested in. That satisfaction you can get from having done your job well.
Mike Delgado: No doubt. Do you have any favorite tools you like to use when developing these data visualizations?
Nadieh Bremer: Yes. So for data analysis and preparation, R. I use a lot of R. And then I start using D3, which is a JavaScript library, but I use Visual Studio to program in. Not Visual Studio … no wait. I always say this wrong. Visual Code?
Mike Delgado: Don’t worry.
Nadieh Bremer: It’s the smaller one, not the actual IDE from Microsoft. So it’s Visual Studio Code, and I always leave off one of the last two words. It works on a Mac and it’s really easy and not overly full of options, but has just the right things I need as more of a front-end programmer. I need to use all the main browsers, of course, but I prefer Chrome. And then I use Illustrator, because some clients also want their visualizations to be turned into a poster or a static thing, so I take … The New York Times has developed a small tool called SVG Crowbar. It’s a link you can put into your Chrome extensions. You click it and then it downloads an SVG of the visualization I’ve made in the browser.
Then I can open it up in Illustrator. I always have to make adjustments because not all of the settings go through, but at least the most important ones go through.
So I think those are my three main things: R, D3 in Visual Studio and Illustrator.
Mike Delgado: Awesome. And for those who are listening to the podcast and want to get those tools, we’ll make sure to put them up on our Experian blog, and the short URL for that is ex.pn/nadieh. And we’ll have it under the Resources section of that blog post, so you can check it out. That was very, very helpful, Nadieh. Thank you so much.
One last question. Poor data visualizations can not only hurt our eyes, but they can also, even worse, tell an incorrect story. Can you share what kind of data visualizations mistakes bother you? For instance, when you’re looking at the paper or you’re looking at a website, I’m kind of curious, what are some of your pet peeves?
Nadieh Bremer: The most obvious one, which comes up very often, is bar charts not starting at zero, because you’re just blatantly sort of lying in a visual way. But there are also more nuanced and small things. I personally don’t like it when I see a default color palette. For example, Tableau and QlikView have default palettes, and when I see that I feel people have not paid attention to the data. They just plugged it in and moved on.
But other things in terms of lying with data: using the wrong chart. Time series data that is continuous should be in a line chart and not a bar chart, because it’s not discrete. These kinds of smaller things. And I guess pie charts with too many slices. Or pie charts in general, maybe.
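A small worked example shows why a bar chart whose axis does not start at zero misleads. The sketch below is an illustration only: the function name `drawn_height_ratio` and the values 105 and 100 are made up for the purpose.

```python
def drawn_height_ratio(a, b, axis_start=0.0):
    """Ratio of the two bar heights as drawn when the axis begins at axis_start.

    A bar's drawn height is its value minus the value where the axis begins,
    so cutting the axis changes the ratio the reader actually sees.
    """
    if axis_start >= min(a, b):
        raise ValueError("axis_start must be below both values")
    return (a - axis_start) / (b - axis_start)


# Hypothetical values: 105 versus 100, a real difference of 5%.
honest = drawn_height_ratio(105, 100)                    # axis starts at 0
truncated = drawn_height_ratio(105, 100, axis_start=95)  # axis starts at 95

print(honest)     # 1.05
print(truncated)  # 2.0
```

With the axis starting at 95, the first bar is drawn twice as tall as the second even though the underlying values differ by only 5%. A line chart encodes values by position rather than bar length, so the same truncation does not distort it in this way.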
Mike Delgado: I always hear from data visualization artists they despise the pie chart.
Nadieh Bremer: Yeah, I don’t think I’ve ever used it. I’ve gone for maybe a donut chart once, but never a pie chart. It’s more in the way that just don’t use a pie chart until you know the rules well enough that you know when to break them. So it’s a little bit of nuance to that, but if you do not feel that comfortable, just pick something other than a pie chart.
The thing is, it’s just difficult for people to see the angles. They are much better at it if they see that in either a stacked bar chart or a normal bar chart. So even with a pie chart you can be slightly … how do you say that … deceiving your audience, and especially with 3D. So do you know in Excel, there’s this 3D cone chart that’s just plain awful? For a vacation, I was looking up the weather in a city that I was going to, and they put it into a 3D cone chart comparing the city to another city in that country. And I was just like, this is a cone, you can’t even see if the temperature in one city is higher than the other because they’re in this weird 3D shape.
I feel like if people even looked at this chart, wow. So I guess, yeah. 3D without adding extra information is just … gets on my nerves.
Mike Delgado: But it looks so cool, Nadieh. 3D is just where it’s at.
Nadieh Bremer: Yeah. No I know what you mean, yeah. Don’t listen to me. I’m just boring with 2D flatland.
Mike Delgado: I have to tell you, after this conversation I’m never going to do a pie chart ever again. I’m scared about using a pie chart.
Nadieh Bremer: That’s good. Let’s put the entirety of this community on you.
Mike Delgado: Yes, yes. Going to send you an email, Nadieh, is this okay? So anyways, we’re at the end of our time here, so just a couple quick questions. What is your favorite programming language and why?
Nadieh Bremer: R.
Mike Delgado: R?
Nadieh Bremer: Because I really like the fact that you can run certain sections. You don’t have to compile everything. You can run something and then run a line all the way at the top and then run lines that are way at the bottom, and there are packages to do practically anything. For anything, somebody has made a package in R. So that’s why I like it a lot. It gives me the fewest headaches.
Mike Delgado: The second and last question: What advice do you have for people who are interested in getting started in data science?
Nadieh Bremer: Be curious and know that you probably have to invest your own time into getting to a good level, especially these days with data science having so many people in it. You should not expect you will learn enough on the job. Take that course on the side. I know it’s hard, but try to get your level to a better place and do that by doing these kinds of personal projects. Pick a question that motivates you, that you are curious about, and then try to find an answer through data science. Figure out what algorithms you need to actually get an answer to that question.
It helps because you both learn something that you always wanted to know, and your skills get better and you can use those skills in a business environment later on. It’s also good because you can build up a portfolio. That might be interesting for data scientists too: a small portfolio of data science–related questions they solved and how they solved them, to show, for example, during a job interview. Portfolios are not just for visual things. You can also have portfolios for skills, so instead of just saying that you can do R and Python, you show that you can do it.
Mike Delgado: That’s great advice. Nadieh, thank you so much for being our guest. Where can everyone learn more about you?
Nadieh Bremer: I have a website called visualcinnamon.com, because as you already figured out, my name is impossible to spell. It’s something that people would find more easy to spell. So visualcinnamon.com. I’m also very active on Twitter, where it’s NadiehBremer, and also on Instagram, same handle there.
Mike Delgado: Wonderful. We’ll make sure to put links in the YouTube About section as well as in the comments of this Facebook video. For anybody who’s listening on the podcast, they can go over to ex.pn/nadieh, spelled N-A-D-I-E-H, and I’ll provide links to all of her social handles, including the work she’s doing on GitHub, so you can follow her there as well. Nadieh, thank you so much for sharing your insights with our community. Been a blast chatting with you. So much fun. All of the way over there in Amsterdam, I’m here in Costa Mesa, California. Thank you so much for your time, and looking forward to keeping in touch.
Nadieh Bremer: Yeah, thank you for having me.
Mike Delgado: Yeah, take care.
Nadieh Bremer: Bye.
Mike Delgado: Bye.
After graduating as an astronomer from Leiden University, Nadieh Bremer became a data scientist finding insights in the vast amounts of data that are hidden within many companies. It took a few years, but she eventually figured out that she loved the visualization of the data and insights even more than the analysis itself. Since then she’s been focusing on and experimenting with the more creative side of data visualization. Check out her data visualization called “The Baby Spike” in Scientific American. She shares her insights at: Visual Cinnamon.
Make sure to follow Nadieh on Twitter, LinkedIn, Pinterest and Github.
Check out our upcoming live video big data discussions.