• Sayali Deodikar

Data Visualization With Plotly

Updated: Nov 29, 2021

In this article, we cover one more Python data visualization library: Plotly.


Plotly creates interactive visualizations. It can build a wide variety of charts, from basic and statistical charts to geographical maps and scientific charts, with very few lines of code.


Some of the features of Plotly:

  • Plotly has hover-tool capabilities that help you spot anomalies in the dataset

  • Its charts are visually more attractive than matplotlib’s, which helps capture the audience’s attention

  • It allows you to customize your graphs, making them more meaningful

  • Plotly provides offline as well as online services, so you can display your charts on the server and save them offline when needed

  • It is a free and open-source library that is accessible to everyone.


Online Services:

Plotly provides an online web service for creating graphs. These graphs are saved in the user’s Plotly account. To create online graphs, you first need to retrieve a personal API key from your Plotly account.

Offline Services:

Plotly graphs can be created offline too, and saved on the local machine. There are two options for creating offline plots:

plotly.offline.plot() – creates the graph as an HTML file, which can be opened in a browser.

plotly.offline.iplot() – displays the graph inside a Jupyter notebook.



Plotly Express


When graphing with Plotly, it is recommended to start with Plotly Express (px).

This module is a built-in part of the plotly library and offers a high-level interface for creating entire figures at once. Let’s start by installing and importing the library.

To install plotly using pip:   pip install plotly
To install plotly using conda:  conda install -c plotly plotly


As usual, we first need to import the module using the command

import plotly.express as px

Here px is the conventional alias for plotly.express. Now let’s see how to create plots.



1. Line Plot

A line plot connects the data points and describes quantitative values over a period of time. Consider India’s state-wise vaccination data. We will plot the total Covaxin doses administered in each state over time using Plotly. The dataset is the file cowin_vaccine_data_statewise.csv.


import pandas as pd
data = pd.read_csv('cowin_vaccine_data_statewise.csv')
# Drop the country-level "India" rows so that only individual states are plotted
data = data[data["State"] != "India"]
fig = px.line(data, x="Updated On", y="Total Covaxin Administered", color='State')
fig.show()




2. Pie Plot

A pie chart is a circular statistical chart that is divided into sectors to illustrate numerical proportions. Here we use the dataset Churn_Modelling.csv, which is available for free on Kaggle.



import pandas as pd
data = pd.read_csv('Churn_Modelling.csv')
# Count customers per region; reset_index() turns the result into a two-column table
customers = data.groupby('Geography')['CustomerId'].count().reset_index()
fig = px.pie(customers, values='CustomerId', names='Geography',
             title='Percentage of total customers from different regions')
fig.show()




3. Bar Graph

A bar graph is used to represent the relationship between numerical and categorical data.

Each category is drawn as a bar whose height reflects its numerical value.


data = px.data.gapminder()
data_canada = data[data.country == 'Canada']
fig = px.bar(data_canada, x='year', y='pop',
             hover_data=['lifeExp', 'gdpPercap'], color='lifeExp',
             labels={'pop':'population of Canada'}, height=400)
fig.show()



4. Scatter Plot

A scatter plot is used to represent the relationship between two different numerical variables. It uses dots to represent the data points.


df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species")
fig.show()



5. Histogram

You can imagine a histogram as a bar graph with no gaps between the bars. It shows the distribution of a numerical variable by counting how many values fall into each range (bin).


data = px.data.iris()
hist_plot_by_species = px.histogram(data_frame= data, 
                                    nbins=30,
                                    x='sepal_length',
                                    color='species',
                                    title='Histogram of Sepal Length')
hist_plot_by_species.show()



7. Box Plot

A box plot is a standard way to represent the distribution of data. It gives a good idea of the data’s variability (dispersion) and of its outliers, which are values that differ greatly from the other data points in the dataset.


df = px.data.tips()
fig = px.box(df, x="day", y="total_bill", color="smoker")
fig.update_traces(quartilemethod="exclusive") # or "inclusive", or "linear" by default
fig.show()


8. Violin plot

A violin plot shows numeric data for one or more groups using density curves. It is helpful when you want to compare the distributions of several groups.


df = px.data.tips()
fig = px.violin(df, y="tip", x="smoker", color="sex", box=True, points="all",
          hover_data=df.columns)
fig.show()


9. Bubble plot

A bubble plot is a scatterplot where a third dimension is added: the value of an additional numeric variable is represented through the size of the dots.


df = px.data.gapminder()

fig = px.scatter(df.query("year==2007"), x="gdpPercap", y="lifeExp",
        size="pop", color="continent",
                 hover_name="country", log_x=True, size_max=60)
fig.show()


10. Heatmap

A heatmap is a technique for visualizing two-dimensional data, using color to show the magnitude of each value. The variation in color may be by hue or intensity. Here we plot the correlation matrix of the churn dataset.


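import pandas as pd
data = pd.read_csv('Churn_Modelling.csv')
data.drop(['RowNumber','Surname'],axis = 1,inplace = True)
# IsBalance is 0 for customers with a zero balance and 1 otherwise
data['IsBalance'] = data['Balance'].where(data['Balance'] == 0, 1)
# Encode gender as a number so it can take part in the correlation
data['Gender'] = data['Gender'].map({'Male':0, 'Female':1})
# Geography is still text, so restrict the correlation to numeric columns
# (the numeric_only argument requires pandas 1.5 or later)
corr = data.corr(numeric_only=True)
fig = px.imshow(corr)
fig.show()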



Plotly Graph Objects


How do we add more customization to our graphs?


Plotly graph objects are Python classes that represent the different parts of a figure. You can consider them similar to the object-oriented approach of matplotlib.

In fact, any figure built by Plotly Express can also be created using the graph objects directly.

The graph objects approach requires much longer code than the one-line Plotly Express functions. However, graph objects also support convenience functions for updating already constructed figures and adding customization to them.


What is a figure?


The figure object is a data structure that holds information on what to display in a graph and how to display it. The plotly Python package helps create, manipulate, and render this object as charts, plots, maps, etc.


A figure has two main parts:

  • Data: Holds the data and the chart types. In Plotly terminology, each entry in the data is called a ‘trace’.

  • Layout: Specifies the options related to formatting the chart, such as adding a title, setting x and y labels, changing colors and fonts, etc.


Creating a figure


You can build a complete figure by passing trace and layout specifications to the plotly.graph_objects.Figure constructor. These trace and layout specifications can be either dictionaries or graph objects.


import pandas as pd
import plotly.graph_objects as go

df = pd.DataFrame({
    "Fruit": ["Apples", "Oranges", "Bananas", "Apples", "Oranges", "Bananas"],
    "Contestant": ["Alex", "Alex", "Alex", "Jordan", "Jordan", "Jordan"],
    "Number Eaten": [2, 1, 3, 1, 3, 2],
})
fig = go.Figure()
# Add one bar trace per contestant
for contestant, group in df.groupby("Contestant"):
    fig.add_trace(go.Bar(x=group["Fruit"], y=group["Number Eaten"], name=contestant,
      hovertemplate="Contestant=%s<br>Fruit=%%{x}<br>Number Eaten=%%{y}<extra></extra>"% contestant))
# Label the legend and both axes
fig.update_layout(legend_title_text = "Contestant")
fig.update_xaxes(title_text="Fruit")
fig.update_yaxes(title_text="Number Eaten")
fig.show()


The same figure can be created using plotly.express with fewer lines of code as shown below.


import plotly.express as px
fig = px.bar(df, x="Fruit", y="Number Eaten", color="Contestant", barmode="group")
fig.show()


Adding traces to plot


Consider the scatter plot created using plotly.express for the iris dataset. You can use the add_trace() method to add another trace on top of it, here a gray reference line.


df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species",
                 title="Using The add_trace() method With A Plotly Express Figure")
fig.add_trace(
    go.Scatter(
        x=[2, 4],
        y=[4, 8],
        mode="lines",
        line=go.scatter.Line(color="gray"),
        showlegend=False)
)
fig.show()



Update Layout


You can update the background colors using the update_layout() method.


fig.update_layout(plot_bgcolor='white', paper_bgcolor='LightSteelBlue')


Another useful customization is to add annotations at certain locations on the plot.




Conclusion:

In this article, we have seen how Plotly can make our visuals customized and more interactive. Along the way, we have gone through the basic concepts of plotly.express and graph objects. Use plotly.express to create the figures and customize them using graph objects. Happy Learning!!




