Data Visualisation: An attractive way to present data!

Hello, Folks!

I am a student practising data science, machine learning, deep learning, and artificial intelligence on cloud computing platforms.

In this article, we will discuss basic data visualisation techniques and how to implement them in a few simple steps.

What is Data Visualisation?

It can be defined in two ways:

First, in simple words, it is a technique for representing data in pictorial form.

Second, in technical terms, it is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualisation tools provide an accessible way to see and understand trends, outliers, and patterns in data.

Many libraries are available for data visualisation. If you are a beginner, start with seaborn and matplotlib; these are the two I use here.

So, let's go through the requirements first.

Install each library with the command given below it.

Libraries Used:

For Data Visualisation:

  1. Seaborn

    pip install seaborn

  2. Matplotlib

    pip install matplotlib

For Data Analysis and Manipulation:

  1. Pandas

    pip install pandas
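Before moving on, it can help to confirm that the three installs worked. The snippet below is a minimal sketch that only checks whether each package can be found by the current Python environment; the names `required` and `missing` are my own and are not part of any library.

```python
import importlib.util

# Packages this article relies on (import names, not pip names).
required = ["pandas", "seaborn", "matplotlib"]

# find_spec returns None when a top-level package cannot be located.
missing = [name for name in required if importlib.util.find_spec(name) is None]

if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All required libraries are available.")
```

If anything is listed as not installed, run the matching pip command again in the same environment that your notebook uses.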

Dataset:

Iris Dataset

Download the dataset and save it as iris.csv in the same folder as your .ipynb file.

Let's Start!

Import the required libraries and the dataset.

Then the head() method, called as object_name.head(), prints the first five rows of the dataset so we can check that it was imported properly.

import pandas as pd
import warnings
warnings.filterwarnings("ignore")
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="white", color_codes=True)
iris = pd.read_csv("iris.csv")
iris.head()

Output:

   Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
0   1            5.1           3.5            1.4           0.2  Iris-setosa
1   2            4.9           3.0            1.4           0.2  Iris-setosa
2   3            4.7           3.2            1.3           0.2  Iris-setosa
3   4            4.6           3.1            1.5           0.2  Iris-setosa
4   5            5.0           3.6            1.4           0.2  Iris-setosa
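Besides head(), a few other quick checks are useful right after loading. The sketch below rebuilds the five rows shown above as a small stand-in frame (called `sample` here, a name I made up) so it runs without the CSV file; with the real data you would apply the same calls to `iris`.

```python
import pandas as pd

# Stand-in for iris.csv: only the five rows printed by iris.head() above.
sample = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "SepalLengthCm": [5.1, 4.9, 4.7, 4.6, 5.0],
    "SepalWidthCm": [3.5, 3.0, 3.2, 3.1, 3.6],
    "PetalLengthCm": [1.4, 1.4, 1.3, 1.5, 1.4],
    "PetalWidthCm": [0.2, 0.2, 0.2, 0.2, 0.2],
    "Species": ["Iris-setosa"] * 5,
})

n_rows, n_cols = sample.shape            # number of rows and columns
missing_cells = sample.isna().sum().sum()  # total count of missing values
numeric_cols = sample.select_dtypes("number").columns.tolist()

print(n_rows, n_cols, missing_cells)
print(numeric_cols)
```

Knowing which columns are numeric matters later on, because the plotting functions used in this article draw only the numeric measurements and use Species for grouping.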

Now, check how many samples each species has with the value_counts() method, called as object_name["column"].value_counts().

iris["Species"].value_counts()

Output:

Iris-virginica     50
Iris-versicolor    50
Iris-setosa        50
Name: Species, dtype: int64
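The same method can also report proportions instead of raw counts. Here is a small sketch, assuming a Species column with the 50/50/50 split shown in the output above; the series is rebuilt by hand so the example does not need the CSV file.

```python
import pandas as pd

# Rebuild a Species column with the counts shown above (50 of each).
species = pd.Series(
    ["Iris-setosa"] * 50 + ["Iris-versicolor"] * 50 + ["Iris-virginica"] * 50,
    name="Species",
)

counts = species.value_counts()                # absolute counts per species
shares = species.value_counts(normalize=True)  # fraction of rows per species
is_balanced = counts.nunique() == 1            # True when every class has the same count

print(shares)
print("Balanced classes:", is_balanced)
```

A balanced dataset like this one is convenient for plotting, since no species dominates the picture just by having more points.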

Now, let's draw a scatter plot with sepal length on the x-axis and sepal width on the y-axis.

iris.plot(kind="scatter", x="SepalLengthCm", y="SepalWidthCm")

Output:

*c* argument looks like a single numeric RGB or RGBA sequence, which should be avoided as value-mapping will have precedence in case its length matches with *x* & *y*.  Please use the *color* keyword-argument or provide a 2-D array with a single row if you intend to specify the same RGB or RGBA value for all points.

<AxesSubplot:xlabel='SepalLengthCm', ylabel='SepalWidthCm'>

output_2_2.png

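A joint plot shows the same scatter plot together with the distribution of each variable along the margins. In seaborn 0.9 and later the figure size is set with height; older versions called this argument size.

sns.jointplot(x="SepalLengthCm", y="SepalWidthCm", data=iris, height=5)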

Output:

<seaborn.axisgrid.JointGrid at 0x11a2a170>

output_3_1.png

To colour the points by species, use a FacetGrid with hue:

sns.FacetGrid(iris, hue="Species", height=5) \
   .map(plt.scatter, "SepalLengthCm", "SepalWidthCm") \
   .add_legend()

Output:

<seaborn.axisgrid.FacetGrid at 0x11b8ced0>

output_4_1.png
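To see roughly what the hue grouping does, here is a plain matplotlib sketch that draws one scatter call per species. It is not how seaborn is implemented, just an equivalent picture. The six rows in `sample` are hand-typed for illustration and are not the full dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative rows only: two hand-typed flowers per species.
sample = pd.DataFrame({
    "SepalLengthCm": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "SepalWidthCm": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    "Species": [
        "Iris-setosa", "Iris-setosa",
        "Iris-versicolor", "Iris-versicolor",
        "Iris-virginica", "Iris-virginica",
    ],
})

fig, ax = plt.subplots(figsize=(5, 5))

# One scatter call per species, so each group gets its own colour and legend entry.
for name, group in sample.groupby("Species"):
    ax.scatter(group["SepalLengthCm"], group["SepalWidthCm"], label=name)

ax.set_xlabel("SepalLengthCm")
ax.set_ylabel("SepalWidthCm")
ax.legend(title="Species")
```

With the real data, replace `sample` with `iris`. The seaborn version is shorter, which is the main reason to prefer it for this kind of plot.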

Now, let's draw a box plot and a strip plot with species on the x-axis and petal length on the y-axis.

sns.boxplot(x="Species", y="PetalLengthCm", data=iris)

Output:

<AxesSubplot:xlabel='Species', ylabel='PetalLengthCm'>

output_5_1.png

ax = sns.boxplot(x="Species", y="PetalLengthCm", data=iris)
ax = sns.stripplot(x="Species", y="PetalLengthCm", data=iris, jitter=True, edgecolor="gray")

Output:

output_6_0.png
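The box in a box plot spans the first to the third quartile, and the line inside it marks the median. The sketch below computes those numbers directly with pandas so you can compare them with the picture. The setosa values are the five rows shown by iris.head(); the versicolor values are hand-typed for illustration.

```python
import pandas as pd

# Small illustrative sample, not the full dataset.
sample = pd.DataFrame({
    "Species": ["Iris-setosa"] * 5 + ["Iris-versicolor"] * 5,
    "PetalLengthCm": [1.4, 1.4, 1.3, 1.5, 1.4, 4.7, 4.5, 4.9, 4.0, 4.6],
})

# Quartiles per species: one row per species, one column per quantile.
quartiles = (
    sample.groupby("Species")["PetalLengthCm"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)

# Interquartile range: the height of the box.
quartiles["IQR"] = quartiles[0.75] - quartiles[0.25]

print(quartiles)
```

By default, matplotlib (and therefore seaborn) draws points that lie more than 1.5 times the IQR beyond the box as individual outliers.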

Now, let's plot the same data as a violin plot.

sns.violinplot(x="Species", y="PetalLengthCm", data=iris)

Output:

<AxesSubplot:xlabel='Species', ylabel='PetalLengthCm'>

output_7_1.png

A kernel density estimate (KDE) plot shows the distribution of petal length for each species:

sns.FacetGrid(iris, hue="Species", height=6) \
   .map(sns.kdeplot, "PetalLengthCm") \
   .add_legend()

Output:

<seaborn.axisgrid.FacetGrid at 0x11c23130>

output_8_1.png

Now, let's draw a pair plot of the dataset.

sns.pairplot(iris.drop("Id", axis=1), hue="Species", height=3)

Output:

<seaborn.axisgrid.PairGrid at 0x11c17a90>

output_9_1.png

Setting diag_kind="kde" draws kernel density estimates on the diagonal:

sns.pairplot(iris.drop("Id", axis=1), hue="Species", height=3, diag_kind="kde")

Output:

<seaborn.axisgrid.PairGrid at 0x124a8550>

output_10_1.png

iris.drop("Id", axis=1).boxplot(by="Species", figsize=(12, 6))

Output:

array([[<AxesSubplot:title={'center':'PetalLengthCm'}, xlabel='[Species]'>,
        <AxesSubplot:title={'center':'PetalWidthCm'}, xlabel='[Species]'>],
       [<AxesSubplot:title={'center':'SepalLengthCm'}, xlabel='[Species]'>,
        <AxesSubplot:title={'center':'SepalWidthCm'}, xlabel='[Species]'>]],
      dtype=object)

output_11_1.png

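Andrews curves turn each row of the dataset into a smooth curve, so flowers of the same species tend to produce similar curves.

from pandas.plotting import andrews_curves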
andrews_curves(iris.drop("Id", axis=1), "Species")

Output:

<AxesSubplot:>

output_12_1.png
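To make the idea behind Andrews curves more concrete, here is a sketch of the textbook formula, f(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t) + x4 sin(2t) + ..., evaluated for the first flower in the dataset. The function `andrews_curve` is my own helper, not part of pandas, and the values pandas plots may differ in detail from this textbook version.

```python
import numpy as np

def andrews_curve(row, t):
    """Evaluate f(t) = x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + ..."""
    row = np.asarray(row, dtype=float)
    t = np.asarray(t, dtype=float)
    # Constant term from the first feature.
    result = np.full_like(t, row[0] / np.sqrt(2.0))
    # Remaining features alternate between sine and cosine terms.
    for i, x in enumerate(row[1:], start=1):
        harmonic = (i + 1) // 2  # 1, 1, 2, 2, 3, ...
        if i % 2 == 1:
            result += x * np.sin(harmonic * t)
        else:
            result += x * np.cos(harmonic * t)
    return result

# Sepal length, sepal width, petal length, petal width of the first row above.
first_flower = [5.1, 3.5, 1.4, 0.2]
t = np.linspace(-np.pi, np.pi, 200)
curve = andrews_curve(first_flower, t)

print(curve[:3])
```

Each flower becomes one such curve. Flowers with similar measurements give similar coefficients, which is why the curves of one species bunch together in the plot.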

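In a parallel coordinates plot, each feature gets its own vertical axis and each flower is drawn as a line across them.

from pandas.plotting import parallel_coordinates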
parallel_coordinates(iris.drop("Id", axis=1), "Species")

Output:

<AxesSubplot:>

output_13_1.png

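RadViz places the features around a circle and positions each flower inside it according to its feature values.

from pandas.plotting import radviz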
radviz(iris.drop("Id", axis=1), "Species")

Output:

<AxesSubplot:>

output_14_1.png

Thank You!

Support Vedant Pandya by becoming a sponsor. Any amount is appreciated!