Seaborn Histogram using sns.distplot() – Python Seaborn Tutorial

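If you have a numeric dataset and want to visualize its distribution, a seaborn histogram will help you. The seaborn distplot function, sns.distplot(), is responsible for plotting it. Note that sns.distplot() has been deprecated since seaborn 0.11; newer releases offer sns.histplot() and sns.displot() for the same job.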

The previous seaborn line plot blog showed how to find a relationship between two dataset variables using the sns.lineplot() function. You may be wondering why you should plot a histogram with seaborn distplot when matplotlib’s plt.hist() does the same job, right?

Don’t worry: choose whichever one fits your requirement and is easier for you.

Plotting seaborn histogram using seaborn distplot function

Here, we use the ‘tips’ DataFrame to plot a seaborn histogram. So let’s get practical without wasting time.

Import Libraries

#Import libraries
import seaborn as sns # For Data Visualization
from scipy.stats import norm # Normal distribution, used later with the fit parameter
import matplotlib.pyplot as plt # For Data Visualization

Load DataFrame from GitHub

To load the dataset from the seaborn GitHub repository, use the sns.load_dataset() function.

#Load "tips" DataFrame from GitHub seaborn repository
tips_df = sns.load_dataset("tips")
tips_df

Output >>>

pandas tips DataFrame - seaborn tutorial

Observe the tips DataFrame (tips_df) above, which contains three numeric columns: ‘total_bill’, ‘tip’ and ‘size’. So, we can plot a histogram for each of them.

Plot tips_df[“size”] Histogram

#Plot Histogram of "size"
sns.distplot(tips_df["size"])

Output >>>

seaborn histogram

Plot tips_df[“tip”] Histogram

#Plot Histogram of "tip"
sns.distplot(tips_df["tip"])

Output >>>

seaborn distplot

Plot tips_df[“total_bill”] Histogram

#Plot Histogram of "total_bill"
sns.distplot(tips_df["total_bill"])

Output >>>

sns histogram

How to modify the seaborn histogram?

The seaborn distplot function has a bunch of parameters that help to decorate the sns histogram.

Syntax: sns.distplot(
                                     a,
                                     bins=None,
                                     hist=True,
                                     kde=True,
                                     rug=False,
                                     fit=None,
                                     hist_kws=None,
                                     kde_kws=None,
                                     rug_kws=None,
                                     fit_kws=None,
                                     color=None,
                                     vertical=False,
                                     norm_hist=False,
                                     axlabel=None,
                                     label=None,
                                     ax=None,
                                    )

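  • a: Pass numeric data as a Series, 1d-array, or list to plot the histogram, as in the examples above.
  • bins: The number of bars (bins), or a list of bin edges. For example, total_bill ranges from about 3 to 51, so bins=55 splits that range into 55 equal-width bars, each a little under 1 unit wide. To get bars that are 5 units wide, pass a list of edges instead, as in the bins section later on.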
#Plot Histogram of "total_bill" with bins parameters
sns.distplot(tips_df["total_bill"], bins=55)

Output >>>

seaborn histogram bins
  • hist: Whether to draw the histogram bars. Pass “False” if you don’t need the histogram; the default is “True”.
#Plot Histogram of "total_bill" with hist parameters
sns.distplot(tips_df["total_bill"], hist = False)

Output >>>

seaborn distplot hist
  • kde: kde stands for “kernel density estimate”. Pass “True” to show it or “False” to hide it.
#Plot Histogram of "total_bill" with kde (kernel density estimate) parameter
sns.distplot(tips_df["total_bill"], kde=False)

Output >>>

seaborn distplot kde
  • rug: Pass “True” to show a rug plot (a small tick for each observation); otherwise “False”.
#Plot Histogram of "total_bill" with rugplot parameters
sns.distplot(tips_df["total_bill"],rug=True,)

Output >>>

seaborn distplot  rug plot
  • fit: Fit a distribution to the data and draw its curve. Pass norm (imported with from scipy.stats import norm) to fit a normal distribution, and set kde to “False” so that only the fitted curve is drawn.
#Plot Histogram of "total_bill" with fit and kde parameters
sns.distplot(tips_df["total_bill"],fit=norm, kde = False) # fit=norm requires: from scipy.stats import norm

Output >>>

seaborn distplot fit
  • color: To set the color of the sns histogram, pass a string such as a hex code or a color name.
#Plot Histogram of "total_bill" with color parameters
sns.distplot(tips_df["total_bill"],color="r",) # pass red color

Output >>>

seaborn distplot color
  • vertical: Pass “True” to put the observed values on the y-axis, so the bars run horizontally; the default “False” keeps the values on the x-axis.
#Plot Histogram of "total_bill" with vertical parameters
sns.distplot(tips_df["total_bill"],vertical=True,) # values on the y-axis, bars drawn horizontally

Output >>>

seaborn distplot vertical
  • norm_hist: Pass “True” to make the histogram height show a density rather than a count; otherwise “False”. When a kde or fitted curve is drawn, the histogram is shown as a density regardless.
#Plot Histogram of "total_bill" with norm_hist parameters
sns.distplot(tips_df["total_bill"],norm_hist=True,)

Output >>>

seaborn distplot norm_hist
  • axlabel: Give a name to the x-axis
#Plot Histogram of "total_bill" with axlabel parameters
sns.distplot(tips_df["total_bill"],axlabel="Total Bill",)

Output >>>

seaborn distplot axlabel
  • label: Give a label to the sns histogram. It only shows up after calling matplotlib.pyplot’s plt.legend() function.
#Plot Histogram of "total_bill" with label parameters
sns.distplot(tips_df["total_bill"],label="Total Bill",)

plt.title("Histogram of Total Bill") # for histogram title
plt.legend() # for label

Output >>>

seaborn distplot label
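
To see what norm_hist changes, here is a minimal sketch in plain Python, independent of seaborn, that counts values per bin and then rescales the counts into a density. The function names histogram_counts and to_density, the bin edges, and the ten-value sample are our own assumptions for illustration; they are not part of the seaborn API, and the sample is only a handful of total_bill values rather than the full column.

```python
from bisect import bisect_right


def histogram_counts(values, edges):
    """Count how many values fall into each bin defined by consecutive edges.

    Bins are half-open [left, right), except the last one, which also
    includes its right edge.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # value lies outside the binned range
        idx = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[idx] += 1
    return counts


def to_density(counts, edges):
    """Rescale counts so that the bar areas (height * width) sum to 1."""
    total = sum(counts)
    widths = [right - left for left, right in zip(edges, edges[1:])]
    return [c / (total * w) for c, w in zip(counts, widths)]


# Illustrative sample only: a few total_bill values, not the full column
sample = [3.07, 7.25, 8.77, 10.07, 10.34, 30.40, 34.30, 40.17, 48.27, 50.81]
edges = [0, 10, 20, 30, 40, 50, 60]  # hypothetical bin edges

counts = histogram_counts(sample, edges)
density = to_density(counts, edges)

# Total area of the density bars
area = sum(d * (r - l) for d, l, r in zip(density, edges, edges[1:]))

print(counts)   # [3, 2, 0, 2, 2, 1]
print(density)
print(area)
```

The counts depend on how many rows you have, while the density bars always enclose a total area of about 1, which is why a density histogram can share an axis with a kde or fitted curve.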

Seaborn distplot Set style and increase figure size

To increase the histogram size, use the plt.figure() function, and for style use sns.set().

# Plot histogram in proper format
plt.figure(figsize=(16,9)) # figure ratio 16:9
sns.set() # for style

sns.distplot(tips_df["total_bill"],label="Total Bill",)

plt.title("Histogram of Total Bill") # for histogram title
plt.legend() # for label

Output >>>

distplot set style and figure size

Seaborn distplot bins

The distplot bins parameter controls which range of data values falls into each bar. If you want the axis ticks to match your bins, use the plt.xticks() function.

First, observe the total_bill column from tips.

tips_df.total_bill.sort_values() # to know the order of values

Output >>>

67      3.07
92      5.75
111     7.25
172     7.25
149     7.51
195     7.56
218     7.74
145     8.35
135     8.51
126     8.52
222     8.58
6       8.77
30      9.55
178     9.60
43      9.68
148     9.78
53      9.94
235    10.07
82     10.07
226    10.09
10     10.27
51     10.29
16     10.33
136    10.33
1      10.34
196    10.34
75     10.51
168    10.59
169    10.63
117    10.65
       ...  
44     30.40
187    30.46
39     31.27
167    31.71
173    31.85
47     32.40
83     32.68
237    32.83
175    32.90
141    34.30
179    34.63
180    34.65
52     34.81
85     34.83
11     35.26
238    35.83
56     38.01
112    38.07
207    38.73
23     39.42
95     40.17
184    40.55
142    41.19
197    43.11
102    44.30
182    45.35
156    48.17
59     48.27
212    48.33
170    50.81
Name: total_bill, Length: 244, dtype: float64

In the above dataset, the min value is 3.07 and the max value is 50.81. So, we can easily create bin edges from 1 to 55 in steps of 5 (the first bin, from 1 to 5, is slightly narrower) and plot the sns histogram.

# Modify histogram with bins  
bins = [1,5,10,15,20,25,30,35,40,45,50,55] # list

plt.figure(figsize=(16,9))
sns.set()

sns.distplot(tips_df["total_bill"], bins = bins)

plt.xticks(bins) # set bins value

plt.title("Histogram of Total Bill") 
plt.show()

Output >>>

distplot bins
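
A list passed to bins is read as bin edges, so the gap between neighbouring entries is the width of each bar. The short sketch below checks the widths of the list used above and compares it with an evenly spaced alternative built with range(); the even_bins variant is our own suggestion, not something the tutorial code uses.

```python
# Bin edges used in the tutorial: the first bin is only 4 units wide
tutorial_bins = [1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
tutorial_widths = [right - left
                   for left, right in zip(tutorial_bins, tutorial_bins[1:])]

# Alternative (an assumption, not from the tutorial): start at 0 so that
# every bin spans exactly 5 units and still covers 3.07 to 50.81
even_bins = list(range(0, 60, 5))
even_widths = [right - left
               for left, right in zip(even_bins, even_bins[1:])]

print(tutorial_widths)  # [4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
print(even_widths)      # [5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
```

Either list can be passed as bins; with equal widths, bar heights are directly comparable across the whole axis.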

sns distplot histogram keyword arguments

The sns.distplot() function accepts keyword arguments (kws) to style the histogram.

Here, we change the color, edge color, line width, line style, and alpha of the histogram.

plt.figure(figsize=(16,9))
sns.set()

# hist keyword argument to change hist format
sns.distplot(tips_df["total_bill"],
            hist_kws = {'color':'#DC143C', 'edgecolor':'#aaff00',
                       'linewidth':5, 'linestyle':'--', 'alpha':0.9}) # hist keyword parameter to change hist format

Output >>>

distplot hist kws
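
A kws dictionary is just a bundle of keyword arguments that gets unpacked into the underlying drawing call. The sketch below shows that general Python pattern only; merge_kws and draw_histogram are hypothetical helpers invented for this example, and the default values are assumptions, not a statement about seaborn internals.

```python
def merge_kws(user_kws, **defaults):
    """Return a copy of user_kws with defaults filled in for missing keys."""
    merged = dict(user_kws or {})
    for key, value in defaults.items():
        merged.setdefault(key, value)  # user-supplied keys always win
    return merged


def draw_histogram(data, **style):
    """Hypothetical stand-in for a plotting call; it reports what it received."""
    return {"n_points": len(data), **style}


hist_kws = {"color": "#DC143C", "edgecolor": "#aaff00", "linewidth": 5}

# "alpha" and "color" here are made-up defaults for the demonstration
final_style = merge_kws(hist_kws, alpha=0.4, color="blue")
result = draw_histogram([3.07, 10.34, 50.81], **final_style)

print(result)
```

The user-supplied color survives while the missing alpha is filled in, which is the behaviour you usually want from a kws-style parameter.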

sns distplot kde(kernel density estimate) keyword arguments

The kde (kernel density estimate) also supports kws. So, we change the color, line width, line style and alpha of the distplot kde.

plt.figure(figsize=(16,9))
sns.set()

# hist and kde keyword arguments to change the plot format
sns.distplot(tips_df["total_bill"],
            hist_kws = {'color':'#DC143C', 'edgecolor':'#aaff00',
                       'linewidth':5, 'linestyle':'--', 'alpha':0.9},
            
            kde_kws = {'color':'#8e00ce', 
                       'linewidth':8, 'linestyle':'--', 'alpha':0.9},
)

Output >>>

 distplot kde kws
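
The kde curve is a smoothed estimate of the distribution: a small bell curve is placed on every data point and the curves are averaged. As a rough sketch of that idea in plain Python (not the routine seaborn itself calls, which also picks the bandwidth automatically), the helper make_gaussian_kde, the hand-picked bandwidth, and the ten-value sample below are all assumptions made for illustration.

```python
import math


def make_gaussian_kde(sample, bandwidth):
    """Return a function that estimates the density of `sample` at a point x,
    using one Gaussian kernel of width `bandwidth` per observation."""
    n = len(sample)
    norm_const = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm_const * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return density


# Illustrative sample only: a few total_bill values, not the full column
sample = [3.07, 7.25, 8.77, 10.07, 10.34, 30.40, 34.30, 40.17, 48.27, 50.81]
kde = make_gaussian_kde(sample, bandwidth=5.0)  # bandwidth chosen by hand

# Riemann sum over a wide grid: the area under the curve should be close to 1
step = 0.1
grid = [-50 + i * step for i in range(1501)]  # covers -50 to 100
area = sum(kde(x) for x in grid) * step

print(kde(10.0), kde(20.0))
print(area)
```

A larger bandwidth gives a smoother, flatter curve, and a smaller one follows individual points more closely.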

sns distplot rugplot keyword arguments

When you want to use a rugplot, pass True to the distplot rug parameter and give kws like color, edge color, line width, line style, and alpha.

plt.figure(figsize=(16,9))
sns.set()

# hist, kde and rug keyword  argument to change hist format
sns.distplot(tips_df["total_bill"],
            hist_kws = {'color':'#DC143C', 'edgecolor':'#aaff00',
                       'linewidth':5, 'linestyle':'--', 'alpha':0.9},
            
            kde_kws = {'color':'#8e00ce', 
                       'linewidth':8, 'linestyle':'--', 'alpha':0.9},
            rug = True,
            rug_kws = {'color':'#0426d0', 'edgecolor':'#00dbff',
                       'linewidth':3, 'linestyle':'--', 'alpha':0.9},)

Output >>>

distplot rug kws

sns distplot fit keyword arguments

To fit a curve to the histogram, give a value such as norm to the distplot fit parameter, along with kws like color, line width, line style, and alpha. For better representation, pass False to kde.

plt.figure(figsize=(16,9))
sns.set()

# hist, fit and rug keyword  argument to change hist format
sns.distplot(tips_df["total_bill"],
            hist_kws = {'color':'#DC143C', 'edgecolor':'#aaff00',
                       'linewidth':5, 'linestyle':'--', 'alpha':0.9},
            
            kde=False,
            fit = norm,
            fit_kws = {'color':'#8e00ce', 
                       'linewidth':12, 'linestyle':'--', 'alpha':0.4},
            rug = True,
            rug_kws = {'color':'#0426d0', 'edgecolor':'#00dbff',
                       'linewidth':5, 'linestyle':'--', 'alpha':0.9},)

Output >>>

distplot fit kws
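
With fit=norm, the curve drawn is the normal density that best matches the data. For a normal distribution, the maximum-likelihood fit comes down to the sample mean and the population standard deviation, so the idea can be sketched with the standard library alone. The helpers fit_normal and normal_pdf and the ten-value sample are our own assumptions for illustration; in the tutorial code the fitting is done by scipy.

```python
import math
import statistics


def fit_normal(sample):
    """Maximum-likelihood normal fit: mean and population standard deviation."""
    mu = statistics.fmean(sample)
    sigma = statistics.pstdev(sample, mu)
    return mu, sigma


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Illustrative sample only: a few total_bill values, not the full column
sample = [3.07, 7.25, 8.77, 10.07, 10.34, 30.40, 34.30, 40.17, 48.27, 50.81]

mu, sigma = fit_normal(sample)
peak = normal_pdf(mu, mu, sigma)  # the fitted curve is highest at the mean

print(mu, sigma)
print(peak)
```

Plotting normal_pdf over a range of x values would give the same kind of bell curve that the fit parameter overlays on the histogram.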

Best way to plot a seaborn histogram

Above, we learned how to use different parameters, functions and keyword arguments. Now, it’s time to use them all in one place; you can also follow this pattern in your projects.

Example:

#Plot histogram in best format
plt.figure(figsize=(16,9))
sns.set()

bins = [1,5,10,15,20,25,30,35,40,45,50,55]
sns.distplot(tips_df["total_bill"],bins=bins,
            hist_kws = {'color':'#DC143C', 'edgecolor':'#aaff00',
                       'linewidth':5, 'linestyle':'--', 'alpha':0.9},
            
            kde=False,
            fit = norm,
            fit_kws = {'color':'#8e00ce', 
                       'linewidth':12, 'linestyle':'--', 'alpha':0.4},
            rug = True,
            rug_kws = {'color':'#0426d0', 'edgecolor':'#00dbff',
                       'linewidth':3, 'linestyle':'--', 'alpha':0.9},
            label = "TB")

plt.xticks(bins)
plt.title("Histogram of Restaurant Total Bill", fontsize = 20)
plt.xlabel("Total Bill", fontsize = 15)
plt.legend()
plt.show()

Output >>>

best seaborn histogram

How to plot multiple seaborn histograms using sns.distplot() function

Till now, we learned how to plot a single histogram, but you can also plot multiple histograms using the sns.distplot() function.

In the code below, the sns.distplot() function is used three times to plot three histograms in a simple format. Homework for you: modify it and share your code in the comment box.

# Plot multiple seaborn histogram in single graph
plt.figure(figsize=(16,9))
sns.distplot(tips_df["total_bill"], bins=9, label="total_bill")
sns.distplot(tips_df["tip"], bins=9, label="tip")
sns.distplot(tips_df["size"], bins=9, label = "size")

plt.legend()

Output >>>

multiple seaborn histograms

Conclusion

In this seaborn histogram blog, we learned how to plot one and multiple histograms with a practical example using the sns.distplot() function. Along the way, we used different functions with different parameters and keyword arguments. We suggest you get your hands dirty with each and every parameter of the above methods. This is the best coding practice. If you haven’t completed the matplotlib tutorial yet, jump on it.
