geom_boxplot

geom_boxplot(mapping=None, *, data=None, stat=None, position=None, show_legend=None, inherit_aes=None, manual_key=None, tooltips=None, orientation=None, fatten=None, outlier_alpha=None, outlier_color=None, outlier_fill=None, outlier_shape=None, outlier_size=None, outlier_stroke=None, varwidth=None, whisker_width=None, color_by=None, fill_by=None, **other_args)

Display the distribution of data based on a five-number summary (“minimum”, first quartile (Q1), median, third quartile (Q3), and “maximum”), and plot “outlying” points individually.

Parameters:
mapping : FeatureSpec

Set of aesthetic mappings created by aes() function. Aesthetic mappings describe the way that variables in the data are mapped to plot “aesthetics”.

data : dict or Pandas or Polars DataFrame

The data to be displayed in this layer. If None, the default, the data is inherited from the plot data as specified in the call to ggplot.

stat : str, default='boxplot'

The statistical transformation to use on the data for this layer, as a string.

position : str or FeatureSpec, default='dodge'

Position adjustment. Either a position adjustment name: ‘dodge’, ‘dodgev’, ‘jitter’, ‘nudge’, ‘jitterdodge’, ‘fill’, ‘stack’ or ‘identity’, or the result of calling a position adjustment function (e.g., position_dodge() etc.).

show_legend : bool, default=True

If False, do not show the legend for this layer.

inherit_aes : bool, default=True

If False, do not combine the layer's aesthetic mappings with the plot's shared mappings.

manual_key : str or layer_key

The key to show in the manual legend. Specify text for the legend label or advanced settings using the layer_key() function.

tooltips : layer_tooltips

Result of the call to the layer_tooltips() function. Specifies appearance, style and content. Set tooltips='none' to hide tooltips from the layer.

orientation : str

Specify the axis that the layer’s stat and geom should run along. The default value (None) automatically determines the orientation based on the aesthetic mapping. If the automatic detection doesn’t work, it can be set explicitly by specifying the ‘x’ or ‘y’ orientation.

fatten : float, default=2.0

A multiplicative factor applied to the size of the middle bar.

outlier_alpha : float

Default transparency aesthetic for outliers.

outlier_color : str

Default color aesthetic for outliers.

outlier_fill : str

Default fill aesthetic for outliers.

outlier_shape : int

Default shape aesthetic for outliers, an integer from 0 to 25.

outlier_size : float

Default size aesthetic for outliers.

outlier_stroke : float

Default width of the border for outliers.

varwidth : bool, default=False

If False, make a standard box plot. If True, boxes are drawn with widths proportional to the square roots of the number of observations in the groups.

whisker_width : float, default=0.5

A multiplicative factor applied to the box width to draw horizontal segments on whiskers.

color_by : {'fill', 'color', 'paint_a', 'paint_b', 'paint_c'}, default='color'

Define the color aesthetic for the geometry.

fill_by : {'fill', 'color', 'paint_a', 'paint_b', 'paint_c'}, default='fill'

Define the fill aesthetic for the geometry.

other_args

Other arguments passed on to the layer. These are often aesthetic settings used to set an aesthetic to a fixed value, like color='red', fill='blue', size=3 or shape=21. They may also be parameters to the paired geom/stat.
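As a rough illustration of the varwidth description above, the sketch below computes widths proportional to the square root of each group's observation count. The helper name relative_box_widths and the scaling to the largest group are assumptions made for this example; the library's actual normalization may differ.

```python
import math


def relative_box_widths(counts):
    """Return widths proportional to sqrt(n) for each group.

    Assumption for this sketch: widths are scaled so that the
    largest group gets a relative width of 1.0.
    """
    roots = {group: math.sqrt(n) for group, n in counts.items()}
    largest = max(roots.values())
    return {group: root / largest for group, root in roots.items()}


# Hypothetical group sizes: 'b' has four times as many observations as 'a'.
widths = relative_box_widths({'a': 25, 'b': 100, 'c': 4})
print(widths)
```

Note that a group with four times as many observations is only twice as wide, which keeps very large groups from dominating the plot.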

Returns:
LayerSpec

Geom object specification.

Notes

Computed variables:

  • ..lower.. : lower hinge, 25% quantile.

  • ..middle.. : median, 50% quantile.

  • ..upper.. : upper hinge, 75% quantile.

  • ..ymin.. : lower whisker = smallest observation greater than or equal to lower hinge - 1.5 * IQR.

  • ..ymax.. : upper whisker = largest observation less than or equal to upper hinge + 1.5 * IQR.
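The definitions above can be sketched in NumPy as follows. This is an illustration only: it assumes NumPy's default linear interpolation for quantiles, which may not match the quantile method used by the boxplot stat, and the function name boxplot_stats is hypothetical.

```python
import numpy as np


def boxplot_stats(values):
    """Compute the five boxplot values and the outliers for one group."""
    values = np.asarray(values, dtype=float)
    # Hinges and median as the 25%, 50% and 75% quantiles.
    lower, middle, upper = np.quantile(values, [0.25, 0.5, 0.75])
    iqr = upper - lower
    low_fence = lower - 1.5 * iqr
    high_fence = upper + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = values[(values >= low_fence) & (values <= high_fence)]
    outliers = values[(values < low_fence) | (values > high_fence)]
    return {
        'lower': float(lower),
        'middle': float(middle),
        'upper': float(upper),
        'ymin': float(inside.min()),
        'ymax': float(inside.max()),
        'outliers': outliers.tolist(),
    }


stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

In this sample the value 30 lies above upper + 1.5 * IQR, so the upper whisker stops at 9 and 30 is reported as an outlier.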

geom_boxplot() understands the following aesthetic mappings:

  • x : x-axis coordinates.

  • lower : lower hinge.

  • middle : median.

  • upper : upper hinge.

  • ymin : lower whisker.

  • ymax : upper whisker.

  • alpha : transparency level of a layer. Accepts values between 0 and 1.

  • color (colour) : color of the geometry lines. For more info see Color and Fill.

  • fill : fill color. For more info see Color and Fill.

  • size : line width.

  • linetype : type of the border line. Accepts codes or names (0 = ‘blank’, 1 = ‘solid’, 2 = ‘dashed’, 3 = ‘dotted’, 4 = ‘dotdash’, 5 = ‘longdash’, 6 = ‘twodash’), a hex string (up to 8 digits for dash-gap lengths), or a list pattern [offset, [dash, gap, …]] / [dash, gap, …]. For more info see Line Types.

  • width : width of the box. Typically ranges between 0 and 1. Values greater than 1 lead to overlapping boxes.


To hide axis tooltips, pass 'blank' or the result of element_blank() as the axis_tooltip or axis_tooltip_x parameter of theme().

Examples

import numpy as np
from lets_plot import *
LetsPlot.setup_html()
n = 100
np.random.seed(42)
x = np.random.choice(['a', 'b', 'c'], size=n)
y = np.random.normal(size=n)
# Default boxplot: one box per category of x.
ggplot({'x': x, 'y': y}, aes(x='x', y='y')) + \
    geom_boxplot()

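import numpy as np
from lets_plot import *
LetsPlot.setup_html()
n = 100
np.random.seed(42)
x = np.random.normal(size=n)
# 'b' is listed twice, so it is sampled about twice as often as 'a' or 'c'.
y = np.random.choice(['a', 'b', 'b', 'c'], size=n)
# Continuous x and categorical y, with a thicker median bar,
# variable box widths and custom outlier markers.
ggplot({'x': x, 'y': y}, aes(x='x', y='y')) + \
    geom_boxplot(fatten=5, varwidth=True,
                 outlier_shape=8, outlier_size=2)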

import numpy as np
import pandas as pd
from lets_plot import *
LetsPlot.setup_html()
n = 100
np.random.seed(42)
x = np.random.choice(['a', 'b', 'c'], size=n)
y = np.random.normal(size=n)
df = pd.DataFrame({'x': x, 'y': y})
# Precompute custom statistics per group: the 1/3 and 2/3 quantiles
# are used as hinges instead of the usual quartiles.
agg_df = df.groupby('x').agg({'y': [
    'min', lambda s: np.quantile(s, 1/3),
    'median', lambda s: np.quantile(s, 2/3), 'max'
]}).reset_index()
agg_df.columns = ['x', 'y0', 'y33', 'y50', 'y66', 'y100']
# stat='identity' draws the boxes from the precomputed columns as-is.
ggplot(agg_df, aes(x='x')) + \
    geom_boxplot(aes(ymin='y0', lower='y33', middle='y50',
                     upper='y66', ymax='y100'), stat='identity')

import numpy as np
import pandas as pd
from lets_plot import *
LetsPlot.setup_html()
n, m = 100, 5
np.random.seed(42)
# Wide frame with columns y1..y5, each holding n normal samples.
df = pd.DataFrame({'y%s' % i: np.random.normal(size=n)
                   for i in range(1, m + 1)})
# melt() reshapes to long form with 'variable' and 'value' columns.
ggplot(df.melt()) + \
    geom_boxplot(aes(x='value', y='variable', color='variable',
                     fill='variable'),
                 outlier_shape=21, outlier_size=1.5, size=2,
                 alpha=.5, width=.5, show_legend=False)