proc boxplot


proc boxplot renders boxplots, also known as box-and-whisker plots, in the current plotting area. By default it produces median-based boxplots, which use the median and interquartile range, but it can also produce mean-based boxplots using the mean and SD.

The recommended method is to use proc getdata to read in raw data, then use proc processdata (action: summaryplus) to compute the necessary summary statistics, then invoke this proc to render the boxplots.

Prior to version 2.40 this proc was called proc rangebar, and it rendered one boxplot (including computation of the necessary statistics). In version 2.40 proc processdata's summaryplus action was implemented, and this proc's operation changed correspondingly: it now renders a series of boxplots and is more tightly coupled with proc processdata. Some previously available functionality was removed, either out of necessity or because it was deemed obscure, including options for display of outlier data points, 1.5 IQR tails, log transform, setting of variables such as RANGEBARMIN, and output of computed stats.




Attributes

Boxplots may be vertical (the default) or horizontal (see orientation below). To keep the explanations simple, this document assumes vertical boxplots.

basis     median | mean

    If median (the default), median-based boxplots are produced (using the median and the 5th, 25th, 75th, and 95th percentiles). If mean, mean-based boxplots are produced (using mean, +/- sd, min, max).

tailmode     5/95 | minmax

    Used only with median-based boxplots. Specifies whether the boxplot tails extend to the 5th and 95th percentiles or to the min and max. Default is 5/95.
    Example: tailmode: minmax
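
To make the relationship between basis, tailmode, and the plotted values concrete, here is a minimal Python sketch (not ploticus code). The function name, the BoxGeometry type, and the dict keys are hypothetical names chosen for illustration. It assumes a median-based box spans the 25th to 75th percentile (the interquartile range), and that a mean-based box spans mean +/- 1 SD with tails reaching min and max; the text above only states which statistics are used, so treat the mean-based layout as an assumption.

```python
from typing import Dict, NamedTuple


class BoxGeometry(NamedTuple):
    center: float     # where the median (or mean) marker goes
    box_low: float    # lower edge of the box
    box_high: float   # upper edge of the box
    tail_low: float   # end of the lower tail
    tail_high: float  # end of the upper tail


def box_geometry(stats: Dict[str, float], basis: str = "median",
                 tailmode: str = "5/95") -> BoxGeometry:
    """Pick the values one boxplot is drawn from.

    `stats` is a hypothetical dict of precomputed statistics; its keys
    (p5, p25, median, p75, p95, mean, sd, min, max) are illustrative names.
    """
    if basis == "median":
        # tailmode applies only to median-based boxplots.
        if tailmode == "5/95":
            tails = (stats["p5"], stats["p95"])
        elif tailmode == "minmax":
            tails = (stats["min"], stats["max"])
        else:
            raise ValueError(f"unknown tailmode: {tailmode!r}")
        return BoxGeometry(stats["median"], stats["p25"], stats["p75"], *tails)
    if basis == "mean":
        # Assumption: box spans mean +/- 1 SD and the tails reach min and max.
        mean, sd = stats["mean"], stats["sd"]
        return BoxGeometry(mean, mean - sd, mean + sd, stats["min"], stats["max"])
    raise ValueError(f"unknown basis: {basis!r}")
```

With basis median, switching tailmode changes only the two tail values; the box and the center marker stay the same.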

locfield     dfield

    The data field indicating boxplot location in X (or location in Y with orientation: horizontal). If not given, boxplots will be rendered at X=1, X=2, X=3 .. X=N.

statfields     dfield list

    If you're using proc processdata (action: summaryplus) first, this attribute should not be set (it will be automatic). If you computed statistics externally, you will need to provide a list of the data fields holding your statistics, using the exact order shown here. For median-based boxplots specify the fields for #observations, 5thpercentile, 25thpercentile, median, 75thpercentile, 95thpercentile, and optionally the mean (if you're doing min/max tails then substitute min for 5thpercentile and max for 95thpercentile). For mean-based boxplots specify the fields for #observations, mean, sd, min, max.
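
If the statistics are computed outside ploticus, the field order described above could be produced with a small helper like the following Python sketch. The function names are hypothetical. Two assumptions are made: percentiles use linear interpolation between closest ranks, and SD is the sample standard deviation. The rules used by proc processdata (action: summaryplus) are not stated here and may differ, so values can deviate slightly.

```python
import statistics
from typing import List, Sequence


def percentile(sorted_vals: Sequence[float], pct: float) -> float:
    """Percentile by linear interpolation between closest ranks.

    Assumption: ploticus may use a different percentile rule, so results
    can differ slightly from proc processdata (action: summaryplus).
    """
    if not sorted_vals:
        raise ValueError("no observations")
    pos = (len(sorted_vals) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def median_row(values: Sequence[float], tailmode: str = "5/95") -> List[float]:
    """Fields in the order: n, low tail, 25th, median, 75th, high tail, mean."""
    v = sorted(values)
    if tailmode == "minmax":
        # min and max take the place of the 5th and 95th percentiles
        low, high = v[0], v[-1]
    else:
        low, high = percentile(v, 5), percentile(v, 95)
    return [len(v), low, percentile(v, 25), percentile(v, 50),
            percentile(v, 75), high, statistics.mean(v)]


def mean_row(values: Sequence[float]) -> List[float]:
    """Fields in the order: n, mean, sd, min, max (sample SD assumed)."""
    return [len(values), statistics.mean(values), statistics.stdev(values),
            min(values), max(values)]
```

Each returned list corresponds to one boxplot, that is, one data row. If a location column is written in front of these values for use with locfield, the field numbers given in statfields need to account for it.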

orientation     vertical | horizontal

    Whether boxplots are drawn vertically or horizontally. Default is vertical.



Bar appearance details

barwidth     n

    The width of the box portion of the rangebar in absolute units. Default is 0.12 inches.
    Example: barwidth: 0.1

color     color

    Specifies the color of the box area. Example: color: yellow

mediansym     symboldetails | dot | line

    Specifies the symbol that will be displayed to show the median (or the mean with basis: mean). May be a symboldetails specification (to get dots, etc.) or line, which is the default. dot or yes gives a small black dot. Example: mediansym: shape=diamond

outline     yes | no

    If yes, box is outlined with a line.

taildetails     linedetails

    Controls color, linewidth, etc. of tail lines.
    Example: taildetails: color=blue width=1.8

outlinedetails     linedetails

    Controls color, linewidth, etc. of box outline and tics.

truncate     yes | no

    If yes, bars are truncated to the plotting area. Default is yes.

ticlen     n

    Length, in absolute units, of the tics that appear at the ends of the tails. Default is 70% of the width of the bar.

95tics     yes | no

    On median-based boxplots using tailmode: minmax, this option allows display of the 5th and 95th percentiles by adding tics. Example: 95tics: yes

meansym     yes | dot | symboldetails

    With median-based boxplots, this option causes a second data point symbol to be placed at the mean. The result will be two symbols, one at the median, and another at the mean. If yes or dot, a default symbol (a small black dot) will be placed at the mean. Other symbols may be rendered by giving other symboldetails specifications. If no, a mean symbol will not be rendered. Default is no.
    Example: meansym: shape=circle style=filled fillcolor=black radius=0.04




Selecting data rows

select     select expression

    Allows cases to be selected for inclusion using a selection expression.
    Example: select: @2 = B




Legend

legendlabel     text

    A label to be associated with the current set of bars, in a legend to be rendered later (proc legend). The \\n construct can be used to force a line break, or the label can be wordwrapped using the proc legend wraplen attribute.

legendtype     color | symbol

    The legend sample can be either the boxplot color or the data point symbol used for the median or mean. Default is color.




N= annotation

printn     yes | no

    If yes, a label showing N (the number of observations) is produced. Default is yes. Example: printn: no

nlocation     locvalue

    Where to position the N label. The label will be aligned with the rangebar. For vertical rangebars location indicates where to place the label in Y; for horizontal, X. Example: nlocation: -4

textdetails     textdetails

    Details for the N= text.

 


Ploticus 2.42 ... May 2013