Activity 3-3

Data Visualization

In this activity, you'll learn how to create a more advanced data visualization in plotly called a choropleth graph. Choropleth graphs are a good way to visualize geographic data. We'll be making one similar to the one below.

Task 1

Begin by downloading the stencil code here. Be sure to rename the file to Also, download the state agriculture data set here.

For this activity, we will need to import both pandas and plotly. We can do this as follows:

from plotly.offline import plot
import plotly.graph_objs as go
import pandas as pd

Place these three lines of code at the top of your program.

Create a function called states_graph that takes no parameters, and remember to call it from your main function. You will use this function to create the visualization shown above. Your first step in this function should be to read the csv file above into a Pandas DataFrame.
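As a rough sketch of that first step, the snippet below reads CSV text into a DataFrame. The helper name load_states_data and the sample rows are made up for illustration; only the 'code' and 'total exports' column names come from this activity. In your own states_graph function you would pass the name of the file you downloaded to pd.read_csv instead.

```python
import io

import pandas as pd

# Made-up sample rows standing in for the downloaded file; only the
# 'code' and 'total exports' column names come from this activity.
SAMPLE_CSV = """code,state,total exports
AL,Alabama,100.5
AK,Alaska,20.0
AZ,Arizona,75.25
"""


def load_states_data(csv_source):
    """Read CSV data into a DataFrame (hypothetical helper name)."""
    return pd.read_csv(csv_source)


# pd.read_csv accepts a file path or a file-like object, so an in-memory
# buffer can stand in for the real file here.
data_frame = load_states_data(io.StringIO(SAMPLE_CSV))
print(data_frame.head())
```

Printing the first few rows with head() is a quick way to confirm that the column names match the ones used later in the activity.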

Task 2

To build the choropleth, we need to create two objects: data and layout. Data specifies how we want our data displayed and layout specifies what we want the overall visualization to look like.

Let's create data!

We'll begin by creating our choropleth object. The basic Choropleth object requires a locations parameter: the column of data that holds the location codes.

choropleth = go.Choropleth(locations = data_frame['code'], ...)

Since these codes correspond to US states, we'll need to specify that using the locationmode parameter.

choropleth = go.Choropleth(locations = data_frame['code'], locationmode = 'USA-states')

Lastly, we need to specify the z parameter, which is the column of data that will be visualized on the color scale.

choropleth = go.Choropleth(locations = data_frame['code'], locationmode = 'USA-states', z = data_frame['total exports'])

Copy and paste this last line of code into your program.

Task 3

Now, let's create the layout! The Layout object is similar to that of the basic Plotly graphs we created, but we need to add a geo parameter. This parameter contains the scope of the map, which describes how much of the world should be plotted. We'll also add the projection, which tells Plotly which map projection to use. The Albers projection finds a good compromise between shape and size distortion.

layout = go.Layout(title='Map of US Agricultural Exports', geo = {'scope':'usa', 'projection':{'type':'albers usa'}})

Task 4

Now that we have our data and layout specified, we can create the plot! Based on what you learned in the previous activity, create a figure, plot it, and run your program. Your graph should look like the one above.

Task 5

You can modify the color scale used by Plotly by passing in your own. For our graph objects, a color scale is a list of lists, where each inner list pairs a fraction between 0 and 1 with the named color that applies at that fraction. The color scale below sets the lowest value to purple, the mid-value to white and the top value to yellow. You can add as many colors as you want, giving each a different fraction. Keeping it simple, however, is best for pleasing visualizations.

color_scale = [[0.0, 'purple'], [0.5, 'white'], [1.0, 'yellow']]

Modify your choropleth object to include the following parameters. Replace ... here with the parameters you previously included.

choropleth = go.Choropleth(..., colorscale=color_scale, autocolorscale = False)

Run your program to see your new color scale. Feel free to play around with different color scales of your choosing. Remember that Plotly recognizes any named HTML color.
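To see how the fractions relate to your data, it can help to work out where a value falls between the smallest and largest z values, which by default sit at fractions 0 and 1. The two helper functions below are hypothetical and written only for illustration; they are not part of Plotly.

```python
def scale_fraction(value, z_min, z_max):
    """Map a data value to its 0-1 position on the color scale."""
    return (value - z_min) / (z_max - z_min)


def is_valid_color_scale(color_scale):
    """Check that the fractions start at 0, end at 1, and never decrease."""
    fractions = [stop[0] for stop in color_scale]
    return (
        fractions[0] == 0.0
        and fractions[-1] == 1.0
        and fractions == sorted(fractions)
    )


color_scale = [[0.0, 'purple'], [0.5, 'white'], [1.0, 'yellow']]

print(is_valid_color_scale(color_scale))  # True
# A value of 50 in data ranging from 0 to 200 sits at fraction 0.25,
# halfway between the purple stop (0.0) and the white stop (0.5).
print(scale_fraction(50, 0, 200))  # 0.25
```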

Task 6

Scatter maps and bubble maps plot the locations of phenomena across a large region. A bubble map is a specialized scatter map that sizes the markers according to the statistic being presented.

For the rest of this activity, we'll be creating a scatter and bubble map. Go ahead and comment out the line in main which calls your choropleth function. Create a new function in your program called create_scatter_bubble_map() and be sure to call it from your main function.

Download the following csv file to your computer: Airports in the US. In your new function, load this csv file into a DataFrame object.

Task 7

For either of these plots, we'll be using the Scattergeo object. It requires lat and lon parameters, and its mode should be set to markers (just like a scatter plot!). Include the following code in your program:

scattergeo = go.Scattergeo(lat = data_frame['lat'], lon = data_frame['long'], mode='markers', locationmode='USA-states')

From your states_graph function, copy the layout, figure and plot lines. Be sure to change the title of your map to something meaningful for this data.

Run your program to see your new graph.

Task 8

Setting the color of the markers to something that corresponds to a column of data, for example the number of takeoffs and landings at an airport, simply requires adding a marker parameter. This is exactly the same as when we wanted to color the markers of a scatter plot. For either maps or plots, we can set these markers to a range of numbers to automatically create a color scale.

Add the marker parameter as shown in the code below. We'll set the color of each point to the 'cnt' data column from our DataFrame object.

scattergeo = go.Scattergeo(..., marker = {'color': data_frame['cnt']})

Run your program to observe your new color-coded map.

Task 9

Creating a bubble map is a simple extension of the scatter map: we just change the size of the markers according to a data column. However, using the raw column by itself is not enough; we must scale it to suit the minimum and maximum of the data. In the code below, we divide the column by 100 to get a bubble size.

scattergeo = go.Scattergeo(..., marker = {'color': data_frame['cnt'], 'size':data_frame['cnt'] / 100})

Try this out, but notice that the bubble sizes aren't terribly great. Try creating your own conversion that works better. Typically, log scales are best for data that spans several orders of magnitude. To apply a log scale to a DataFrame column, import the numpy module and use its log function.
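The difference between the two conversions can be seen on a few made-up counts. In this sketch the multiplier 3 is an arbitrary choice, not a recommended value; pick whatever gives readable bubbles for your data. Note that np.log is the natural logarithm and returns negative infinity for a count of zero, so such rows would need separate handling.

```python
import numpy as np
import pandas as pd

# Made-up counts spanning several orders of magnitude.
counts = pd.Series([10, 1000, 100000], name='cnt')

# Dividing by 100 keeps the full spread: the largest bubble is
# 10,000 times the size of the smallest.
linear_sizes = counts / 100

# Taking the log first compresses the spread to a factor of 5.
log_sizes = np.log(counts) * 3  # the factor 3 is an arbitrary choice

print(linear_sizes.tolist())
print(log_sizes.round(1).tolist())
```

The resulting column can be passed as the 'size' entry of the marker dictionary in place of data_frame['cnt'] / 100.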

Task 10

Lastly, we'll want to create meaningful hover text labels. Text labels follow the basics of HTML formatting. When using columns of data from a DataFrame object, though, we must convert each one to strings using the syntax shown below.

labels = data_frame['airport'].astype(str) + "<br>Takeoffs/Landings: " + data_frame['cnt'].astype(str)

Adding text labels is very similar to adding labels to the basic charts, as we did in the previous activity.

scattergeo = go.Scattergeo(..., text = labels)

Test out your code and ensure your hover text labels now show the desired information. Feel free to add more information or styling to the labels.
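To preview the labels without opening the map, you can build them from a small DataFrame and print them. The rows below are made up; the column names are the ones used in this activity. This sketch uses `<br>`, the standard HTML line-break tag, to start a new line in the hover text.

```python
import pandas as pd

# Made-up rows; the column names follow the ones used in this activity.
data_frame = pd.DataFrame({
    'airport': ['Airport A', 'Airport B'],
    'cnt': [25000, 21000],
})

# astype(str) turns each value into text, so + joins the pieces row by row.
labels = (
    data_frame['airport'].astype(str)
    + "<br>Takeoffs/Landings: "
    + data_frame['cnt'].astype(str)
)

print(labels.tolist())
# ['Airport A<br>Takeoffs/Landings: 25000', 'Airport B<br>Takeoffs/Landings: 21000']
```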

If you have extra time

Work through some of the other Plotly tutorials to get the hang of creating useful maps:

Choropleth Maps

Bubble Maps

Scatter Maps

Lines on Maps

Other Plotly Map Examples

Once you're done, please check off your lab with a TA or share your file with by midnight, 4/18.