Visualizing Data

We all know that visualizing data is an important part of Data Science.
If it is done wrong, it can be boring and fail to grab the readers’ attention or, even worse, convey the wrong message.
If it is done correctly, it can intrigue even the most indifferent reader (some people even turn data visualizations into an art form).

I personally think Python’s matplotlib is a great library for data visualization. Another amazing library is D3, which, like matplotlib, is very intuitive and flexible. On top of that, it is a JavaScript library, so it works in the browser, which makes it platform independent, and you don’t have to install any software. Did I already tell you D3 is a.. maa.. zing!?

That is why I will focus on data visualizations with D3 in the future. But for now, I will start with something simpler and show you how to make a choropleth map. This is the kind of map you see at every election, where each state is colored in the color of the winning party. Although it might seem difficult to make such a map, it is not.


Step 1: Get a Map

The first thing we need is a map of a country (or the area we want to visualize) in SVG form. Wikipedia has a nice collection of blank maps we can use. Copy the code of this map into a <div> element of a basic HTML page. As an example, we can take this map of the world.
We also need to include the jQuery library, so go ahead and link to the latest jQuery version hosted by Google like this:


If we open the page now, we should see the map drawn out.



Step 2: Color the map

As we can see in the code, each <path> element has its own id. The code for Australia, for example, looks like:

Sometimes we might be lucky and this id will actually be equal to the name of the state/country, and sometimes it might be a random number/word. If that is the case, don’t lose any sleep over it. It is not very difficult to discover which element belongs to which country with Chrome Developer Tools (right-click on the country and then click on ‘inspect element’).

Now that we know the id of the country we want to color in, we can give it a color with the JavaScript code:
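A minimal sketch of what that could look like, assuming jQuery is loaded and the country’s <path> has the id AU. Both the id and the helper name colorCountry are made up for illustration; use the id you found in the developer tools.

```javascript
// Sketch only: colorCountry is a hypothetical helper, not part of jQuery.
function colorCountry(id, color) {
  // SVG shapes are painted through the "fill" style property,
  // which jQuery can set with .css() like any other style.
  $("#" + id).css("fill", color);
}

// In the page, once the SVG is part of the document
// ("AU" is an assumed id, check the real one in your map):
// colorCountry("AU", "green");
```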


Now we need some data to fill in the map. Since the war in Syria / the Syrian refugee crisis is a current issue, it might be interesting to see which countries are donating the most / least to the Syrian crisis. The data for this can be found on this website. We could choose to color based on the absolute amount of money, but it seems fairer to look at the donated amount relative to each country’s GDP.
If we divide the donated amount by the GDP of that country for that year, we will get this data. Now we only need to put it in the correct format, which is JSON.
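To make the division concrete, here is a small sketch of the calculation. The function name donationPerGdp and the numbers in the example are my own placeholders, not real figures. The result is expressed in thousandths of a percent of GDP, the unit used in the dataset.

```javascript
// Donation relative to GDP, in thousandths of a percent.
// Both amounts must be in the same currency and for the same year.
function donationPerGdp(donationUsd, gdpUsd) {
  // x100 turns the ratio into a percentage, x1000 into thousandths of a percent.
  return (donationUsd * 100 * 1000) / gdpUsd;
}

// Made-up round numbers: a donation of 50 million on a GDP of 1 trillion.
var example = donationPerGdp(50e6, 1e12);
console.log(example); // prints 5, i.e. 0.005 % of GDP
```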

In our example, the correct data in the correct format looks like:
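As a rough sketch of the shape, assuming a flat object keyed by country name (as the data["Switzerland"] lookup further down suggests), a fragment might look like this. The two values are the approximate figures for Russia and Canada mentioned later in this post, not entries copied from the dataset.

```javascript
// Assumed shape: country name -> donation in thousandths of a percent of GDP.
// Only two entries are shown, with the approximate values quoted in this post.
var data = {
  "Canada": 11,
  "Russia": 0.5
};

// Looking up one country works like any other object property.
console.log(data["Canada"]);
```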


The complete dataset can be downloaded from here. In this file each number indicates the donated amount in thousandths of a percent of the annual GDP. Go ahead and place the data in a <script> tag so that it can be accessed by JavaScript.
You can check whether the data is recognized by the browser by executing console.log(data["Switzerland"]) within a <script> tag. This should print the data for Switzerland in the console of the browser:




Now the entire map can be filled in with a JavaScript function that iterates through the variable containing the data:
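One way to write such a function is sketched below. The helper names gradientColor, colorsFor and fillMap are hypothetical, the linear blue-to-green interpolation is my assumption about the scale, and the code assumes every key in the data equals the id of a <path> in the SVG. Countries with a value of zero are colored red.

```javascript
// Map a value in [min, max] to a color between blue (low) and green (high).
function gradientColor(value, min, max) {
  // Position on the scale, clamped to the range 0..1.
  var t = max > min ? (value - min) / (max - min) : 0;
  t = Math.min(1, Math.max(0, t));
  var green = Math.round(255 * t);
  var blue = 255 - green;
  return "rgb(0, " + green + ", " + blue + ")";
}

// Work out the fill color for every entry; zero donations become red.
function colorsFor(data) {
  var names = Object.keys(data);
  var donated = names
    .map(function (name) { return data[name]; })
    .filter(function (value) { return value > 0; });
  var min = Math.min.apply(null, donated);
  var max = Math.max.apply(null, donated);
  var colors = {};
  names.forEach(function (name) {
    colors[name] = data[name] > 0
      ? gradientColor(data[name], min, max)
      : "red";
  });
  return colors;
}

// Apply the colors to the map with jQuery.
// Assumes each key is also the id of a <path> and is a valid selector.
function fillMap(data) {
  var colors = colorsFor(data);
  Object.keys(colors).forEach(function (id) {
    $("#" + id).css("fill", colors[id]);
  });
}
```

In the page, calling fillMap(data) once the SVG is part of the document colors every country in one pass. Keeping the color calculation in colorsFor, separate from the jQuery call, makes it easy to try a different scale later.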



With the correct colors filled in, the map looks like:


In the above map all of the countries with no donations for the Syrian crisis (in 2015) are colored red. The countries which have donated money are colored in based on a blue to green gradient, where blue indicates a relatively low and green a relatively large donation (Russia ~ 0.5 / 1000 % of GDP and Canada ~ 11 / 1000 % of their GDP).
If you are interested, you can download the entire HTML file from here.


Step 3: Choosing the correct colors

Now that I have covered the basics of making a choropleth map, I want to point out that the way you choose to visualize your data can have a huge impact on the message your visualization conveys.

If the countries with no donated money were left untouched, the first impression of the visualization would be that there is no data available on these countries.



Choosing a gradient scale from red to green instead of blue to green conveys the message that the countries colored red have done something bad (red is associated with danger).


Although I think everybody can donate more, I would not want to give the impression that Brazil has done something bad by donating ‘only’ 5,000,000 USD.

