When it comes to visualizing data, most people have a straightforward idea of what to do. Scatterplots display the relationship between two variables. Boxplots compare the distribution of a variable across distinct groups. Pie charts show how different classes contribute to the whole. Time series plots display how a quantity, such as the progress made by a person or an organization, changes over time.
Apart from having a solid idea of which chart to use, it is important to choose a software package for building it, and there are many options available. ggplot2 in R, seaborn in Python, Tableau, Power BI and MS Excel are among the best-known tools used to build charts.
This article focuses on the process of building charts with three of these tools: Tableau, seaborn and ggplot2. The dataset used is the widely used iris dataset, which has five variables. Four of them are continuous: petal length, petal width, sepal length and sepal width. The last one is a categorical variable called species, which has three classes: setosa, virginica and versicolor.
By building the same charts with all three tools, one can compare the quality of the results and decide which tool to use when working on data visualization projects. The two charts that were generated are:
1. A scatterplot that shows the relationship between sepal width and sepal length.
2. A bar chart that compares the average values of the four continuous variables across the different species.
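The averages behind the second chart can be sketched in Python with pandas. This is a minimal illustration rather than the code used for the article: it assumes pandas is installed, uses only the first two rows of each species as a stand-in for the full dataset, and borrows the snake_case column names from seaborn's copy of iris.

```python
import pandas as pd

# Illustrative sample: the first two rows of each species from the iris data.
# Column names follow seaborn's copy of the dataset (an assumption here).
sample = pd.DataFrame(
    [
        (5.1, 3.5, 1.4, 0.2, "setosa"),
        (4.9, 3.0, 1.4, 0.2, "setosa"),
        (7.0, 3.2, 4.7, 1.4, "versicolor"),
        (6.4, 3.2, 4.5, 1.5, "versicolor"),
        (6.3, 3.3, 6.0, 2.5, "virginica"),
        (5.8, 2.7, 5.1, 1.9, "virginica"),
    ],
    columns=["sepal_length", "sepal_width", "petal_length", "petal_width", "species"],
)

# Average each measurement within each species (one row per species).
species_means = sample.groupby("species").mean()

# Reshape to long form: one row per (species, measurement) pair,
# which is the shape a grouped bar chart usually expects.
long_means = species_means.reset_index().melt(
    id_vars="species", var_name="measurement", value_name="average"
)
print(long_means)
```

With the full dataset the same two steps apply unchanged; only the input DataFrame differs.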
The iris dataset ships with R and is available in Python through libraries such as seaborn and scikit-learn. It was therefore easy to export it for use in Tableau.
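One way to do such an export from Python is to write the data to a CSV file, which Tableau can open as a text file data source. The sketch below is an assumption about how this could be done, not the exact steps followed for this article. The file name iris_sample.csv and the three-row sample are hypothetical placeholders.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Small stand-in for the full dataset (first row of each species).
# With network access, the full data could instead be loaded with
# seaborn.load_dataset("iris"), which downloads it on first use.
iris_sample = pd.DataFrame(
    {
        "sepal_length": [5.1, 7.0, 6.3],
        "sepal_width": [3.5, 3.2, 3.3],
        "petal_length": [1.4, 4.7, 6.0],
        "petal_width": [0.2, 1.4, 2.5],
        "species": ["setosa", "versicolor", "virginica"],
    }
)

# Hypothetical output location; any folder Tableau can read from works.
out_path = Path(tempfile.gettempdir()) / "iris_sample.csv"

# index=False keeps the pandas row index out of the file, so Tableau
# sees only the five real columns.
iris_sample.to_csv(out_path, index=False)

# Read the file back to confirm what Tableau would receive.
round_trip = pd.read_csv(out_path)
print(round_trip)
```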
Tableau is a platform designed to make data visualization as easy as possible. Its main advantage over Python and R is that it does not require code to load the dataset or to create graphs. Its drag-and-drop interface lets users experiment with the variables until a chart presents the information effectively. It also has features for styling charts to make them appealing to an audience.
Tableau’s ease of use can be seen in the video above. A useful guide for beginners learning Tableau is Ben Jones’ Communicating Data with Tableau: Designing, Developing, and Delivering Data Visualizations. Other charts that were built using Tableau can be viewed below.
ggplot2 is a popular data visualization package for R. Unlike Tableau, it requires users to load a package and write code to build charts. Although it requires some coding, the syntax is quite straightforward. Building a simple chart with ggplot2 involves two easy steps.
The first step is to load the tidyverse package. ggplot2 is one of the core packages in the tidyverse, so loading the tidyverse also gives users access to the other packages' functionality while designing graphs. Loading it takes a single line of code: library(tidyverse).
The second step is to write the code that generates the graph, which can be seen below. ggplot() initializes the plot and identifies the data to be used. geom_point() specifies that a scatterplot with points is the desired graph. Using aes() within geom_point() maps the variables that should appear on the x and y axes and groups the points according to their species. labs() adds a title for the graph and labels both the x and y axes. theme_classic() sets the theme to classic, which is one way for the user to control the overall look of the plot.
For users interested in plotting a different kind of chart with ggplot2, this link can act as a guide.
Seaborn is a data visualization library for Python. It is built on top of matplotlib, another Python data visualization library, and makes it easier to produce attractive graphs. Seaborn works much like ggplot2 in the sense that users load a package and write code to obtain the desired plot. Below is the code for loading the seaborn package and other useful packages that make it easy to design the graph.
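A typical set of imports might look like the following. This is a sketch under the assumption that seaborn, matplotlib and pandas are installed; the aliases sns, plt and pd are common conventions rather than requirements, and the call to sns.set_theme() is an optional extra that is not described in the text.

```python
# Conventional aliases for the plotting stack (assumed installed).
import matplotlib.pyplot as plt  # figure size, titles and axis labels
import pandas as pd              # holds and reshapes the data
import seaborn as sns            # statistical plots built on matplotlib

# Optional: apply seaborn's default styling to every later plot.
sns.set_theme(style="whitegrid")
```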
After loading the packages, the next step is to call the right functions to plot a chart. plt.figure() can be used to set the size of the plot. sns.barplot() takes the variables to be placed on the x and y axes as well as the dataset to be used. Further changes to the appearance of the bars can be made through additional arguments to sns.barplot(). plt.title(), plt.xlabel() and plt.ylabel() are used to add a title and axis labels to the plot.
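Put together, these calls could look roughly like the sketch below. It is an illustration, not the code used for this article: it assumes seaborn, matplotlib and pandas are installed, uses a six-row sample in place of the full dataset, and the figure size, labels and output file name are hypothetical choices. sns.barplot() averages the values in each group by default, so no separate averaging step is needed.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Illustrative sample: the first two rows of each species from the iris data.
sample = pd.DataFrame(
    [
        (5.1, 3.5, 1.4, 0.2, "setosa"),
        (4.9, 3.0, 1.4, 0.2, "setosa"),
        (7.0, 3.2, 4.7, 1.4, "versicolor"),
        (6.4, 3.2, 4.5, 1.5, "versicolor"),
        (6.3, 3.3, 6.0, 2.5, "virginica"),
        (5.8, 2.7, 5.1, 1.9, "virginica"),
    ],
    columns=["sepal_length", "sepal_width", "petal_length", "petal_width", "species"],
)

# Long form: one row per (flower, measurement), so the four variables
# can be drawn side by side for each species.
long_form = sample.melt(id_vars="species", var_name="measurement", value_name="value")

plt.figure(figsize=(8, 5))  # width and height in inches (arbitrary choice)

# barplot() draws the mean of "value" for each species/measurement pair.
ax = sns.barplot(data=long_form, x="species", y="value", hue="measurement")

plt.title("Average measurements by species")
plt.xlabel("Species")
plt.ylabel("Average value (cm)")

# Hypothetical output file; in a notebook, plt.show() would be used instead.
chart_path = Path(tempfile.gettempdir()) / "iris_barplot.png"
plt.savefig(chart_path)
```

Passing the long-form table with hue="measurement" is one design choice among several; plotting a single variable per chart with x="species" and y set to that column would work as well.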
For users interested in plotting a different kind of chart with seaborn, this link can act as a guide.
All three tools discussed above are well suited to designing and building graphs. Tableau is a great option for someone who wants to generate charts easily without coding. ggplot2 and seaborn are code-based tools that give users open-ended control over the appearance of their graphs. When it comes to data visualization, your imagination is your only limit.