pyspark.pandas.DataFrame.boxplot
- DataFrame.boxplot(**kwds)
Make a box plot of the Series columns.
- Parameters
- **kwds : optional
Additional keyword arguments are documented in pyspark.pandas.Series.plot().
- precision : scalar, default = 0.01
This argument is used by pandas-on-Spark to compute approximate statistics for building a boxplot. Use smaller values to get more precise statistics (matplotlib-only).
- Returns
plotly.graph_objs.Figure
Return a custom object when backend != plotly. Return an ndarray when subplots=True (matplotlib-only).
Notes
There are behavior differences between pandas-on-Spark and pandas.
pandas-on-Spark computes approximate statistics, so expect differences between pandas and pandas-on-Spark boxplots, especially in the 1st and 3rd quartiles.
The whis argument is only supported as a single number.
pandas-on-Spark doesn't support the following arguments (matplotlib-only):
- bootstrap
- autorange
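As an illustration of what a single-number whis means, here is a minimal sketch in plain Python. It follows the usual box plot convention (the one matplotlib documents for a float whis): the whisker fences sit at Q1 - whis * IQR and Q3 + whis * IQR. The helper name whisker_fences is hypothetical, and the quartiles here are exact, whereas pandas-on-Spark approximates them, so this is not the library's internal implementation.

```python
import statistics


def whisker_fences(values, whis=1.5):
    """Return the (lower, upper) whisker fences for a single-number whis.

    Hypothetical helper for illustration only. It uses exact quartiles;
    pandas-on-Spark computes approximate ones, so its results can differ.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range
    # Whiskers are drawn to the most extreme data points inside these fences;
    # points outside them are shown as outliers.
    return q1 - whis * iqr, q3 + whis * iqr


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
lower, upper = whisker_fences(data, whis=1.5)
print(lower, upper)
```

Passing a smaller whis narrows the fences, so more points fall outside them and are drawn as outliers.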
Examples
Draw a box plot from a DataFrame with four columns of randomly generated data.
For Series:
>>> import numpy as np
>>> import pyspark.pandas as ps
>>> data = np.random.randn(25, 4)
>>> df = ps.DataFrame(data, columns=list('ABCD'))
>>> df['A'].plot.box()
This function is not supported for the DataFrame type.