proc boxplot renders boxplots (also known as box-and-whisker plots) in the
current plotting area.
By default it produces median-based boxplots, which use the median and interquartile range, but it can also produce
mean-based boxplots using the mean and SD.
The recommended method is to use proc getdata to read in raw data, then use proc processdata (action: summaryplus) to
compute the necessary summary statistics, then invoke this proc to render the boxplots.
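The recommended sequence might be sketched as follows. This is a minimal illustration, not a tested script: the file name measurements.dat, the field numbers, and the axis ranges are hypothetical, and the fields and valfield attributes of proc processdata are assumptions that should be checked against that proc's own page.

```
// Hypothetical raw data file: field 1 = group, field 2 = numeric value
#proc getdata
  file: measurements.dat

// Assumption: fields names the grouping field and valfield the value field
#proc processdata
  action: summaryplus
  fields: 1
  valfield: 2

// Ranges assume three groups with values between 0 and 100
#proc areadef
  rectangle: 1 1 5 4
  xrange: 0 4
  yrange: 0 100
  xaxis.stubs: inc 1
  yaxis.stubs: inc 10

// No statfields or locfield given: boxplots are placed at X=1, X=2, X=3
#proc boxplot
```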
Prior to version 2.40 this proc was called proc rangebar, and it rendered one boxplot (including computation of the
necessary statistics). In version 2.40 proc processdata's summaryplus action was implemented.
Correspondingly, this proc's operation changed: it now renders a series of boxplots and is more tightly coupled with proc processdata.
Some previously available functionality was removed out of necessity
or where deemed obscure, including options for display of outlier data points,
1.5 IQR tails, log transform, setting of variables such as RANGEBARMIN, and output of computed stats.
Attributes
Rangebars may be vertical (the default) or horizontal (see orientation below);
to make things easier to explain in this document, vertical boxplots are assumed.
basis
median | mean
If median (the default), median-based boxplots are produced (using the median and the 5th, 25th, 75th, and 95th percentiles).
If mean, mean-based boxplots are produced (using mean, +/- SD, min, and max).
tailmode
5/95 | minmax
Used only with median-based boxplots.
Specifies whether the rangebar tails are to extend to the 5th and 95th percentiles, or to the min and max.
Default is 5/95.
Example: tailmode: minmax
locfield
dfield
The data field indicating boxplot location in X (or location in Y with orientation: horizontal).
If not given, boxplots will be rendered at X=1, X=2, X=3 .. X=N.
statfields
dfield list
If you're using proc processdata (action: summaryplus) first, this attribute should not be set (it will be automatic).
If you computed statistics externally, you will need to provide a list of the data fields holding your statistics, using
the exact order shown here.
For median-based boxplots specify the fields for #observations, 5thpercentile, 25thpercentile, median, 75thpercentile, 95thpercentile, and
optionally the mean (if you're doing min/max tails then substitute min for 5thpercentile and max for 95thpercentile).
For mean-based boxplots specify the fields for #observations, mean, sd, min, max.
orientation
vertical | horizontal
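When the statistics are computed externally, the statfields order described above can be illustrated with a small sketch. The data rows, field positions, and axis ranges below are made up for illustration; only the ordering of the statistics follows the description above.

```
// Hypothetical precomputed rows for a median-based boxplot:
// location, N, 5th pctile, 25th pctile, median, 75th pctile, 95th pctile
#proc getdata
  data:
  1 40 12 25 33 41 58
  2 38 15 28 36 47 63
  3 42  9 20 29 38 55

#proc areadef
  rectangle: 1 1 5 4
  xrange: 0 4
  yrange: 0 70
  xaxis.stubs: inc 1
  yaxis.stubs: inc 10

// Field 1 gives the X location; fields 2 to 7 hold the statistics in order
#proc boxplot
  locfield: 1
  statfields: 2 3 4 5 6 7
```

For min/max tails, the same layout would apply with min and max in place of the 5th and 95th percentile fields, together with tailmode: minmax.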
Bar appearance details
barwidth
n
The width of the box portion of the rangebar in absolute units.
Default is 0.12 inches.
Example: barwidth: 0.1
color
color
Specifies the color of the box area.
Example: color: yellow
mediansym
symboldetails | dot | line
Specifies the symbol that will be displayed to show the median (or the mean with basis: mean).
May be a symboldetails specification (to get dots, etc.) or line, which is the default.
dot or yes gives a small black dot.
Example: mediansym: shape=diamond
outline
yes | no
If yes, box is outlined with a line.
taildetails
linedetails
Controls color, linewidth, etc. of tail lines.
Example: taildetails: color=blue width=1.8
outlinedetails
linedetails
Controls color, linewidth, etc. of box outline and tics.
truncate
yes | no
If yes, bars are truncated to plotting area.
Default is yes.
ticlen
n
Length, in absolute units, of the tics which appear at the ends of the tails.
Default is 70% of the width of the bar.
95tics
yes | no
On median-based boxplots using tailmode: minmax, this option allows display of 5th and 95th percentile by adding tics.
Example: 95tics: yes
meansym
yes | dot | symboldetails
With median-based boxplots, this option causes a second data point symbol to be placed at the mean.
The result will be two symbols, one at the median, and another at the mean.
If yes or dot, a default symbol (a small black dot) will be placed at the mean.
Other symbols may be rendered by giving other symboldetails specifications.
If no, a mean symbol will not be rendered. Default is no.
Example: meansym: shape=circle style=filled fillcolor=black radius=0.04
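The appearance attributes above can be combined in a single proc boxplot invocation. The following fragment is only a sketch: it reuses the example values given for the individual attributes and assumes the data and plotting area have already been set up earlier in the script.

```
// Median-based boxplots with min/max tails, plus tics at the 5th and 95th percentiles
#proc boxplot
  tailmode: minmax
  95tics: yes
  barwidth: 0.1
  color: yellow
  outline: yes
  taildetails: color=blue width=1.8
  mediansym: shape=diamond
  meansym: dot
```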
Selecting data rows
select
select expression
Allows cases to be selected for inclusion using a selection expression.
Example: select: @2 = B
Legend
legendlabel
text
A label to be associated with the current set of bars, in a legend to be rendered later (proc legend).
The \n construct can be used to force a line break,
or the label can be word-wrapped using the proc legend wraplen attribute.
legendtype
color | symbol
Legend sample can be the boxplot color or the data point symbol used for the median or mean.
Default is color.
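As a brief sketch of how the legend attributes fit together, the fragment below attaches a label to a set of boxplots and then renders the legend. The label text is hypothetical, and proc legend is invoked with its defaults here; its placement attributes are described on its own page.

```
// Register a legend entry that uses the box color as the sample
#proc boxplot
  color: yellow
  legendlabel: Treatment group
  legendtype: color

// Render all legend entries accumulated so far
#proc legend
```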
N= annotation
printn
yes | no
If yes, a label showing N (the number of observations) is produced. Default is yes.
Example: printn: no
nlocation
locvalue
Where to position the N label. The label will be aligned with the rangebar.
For vertical rangebars location indicates where to place the label
in Y; for horizontal, X.
Example: nlocation: -4
textdetails
textdetails
|