SAS Box Plot

Author: 不连续小姐 | Published: 2019-05-04 10:10

    SAS Day 33: Box Plot

    Definition:

    A box plot (also called a box-and-whisker plot) displays the distribution of a dataset through its 5-number summary: minimum, first quartile (Q1), median, third quartile (Q3), and maximum.

    Interpreting quartiles:

    The 5-number summary divides the data into 4 sections, each containing approximately 25% of the data.
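    As a quick illustration, the 5-number summary can be printed directly with PROC MEANS. This is a minimal sketch that assumes the sashelp.class sample dataset (the same one used in the example further down) and groups by sex only to match that example.

    ```sas
    /* Five-number summary of weight, computed separately for each sex */
    proc means data=sashelp.class min q1 median q3 max maxdec=1;
      class sex;     /* one row of statistics per sex */
      var weight;    /* the analysis variable */
    run;
    ```

    The values in this table are the ones the box and whiskers are drawn from, so it is a handy companion when reading a plot.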

    Explore a little more

    If we want to look at outliers, we define points below Q1 - 1.5(Q3 - Q1) or above Q3 + 1.5(Q3 - Q1) as outliers.
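    The outlier rule above can be sketched in a few steps. The dataset names quartiles and fences are made up for this illustration, and the quartiles use the PROC MEANS default definition, so other software may give slightly different fences.

    ```sas
    /* Step 1: Q1 and Q3 of weight for each sex */
    proc means data=sashelp.class noprint nway;
      class sex;
      var weight;
      output out=quartiles q1=q1 q3=q3;
    run;

    /* Step 2: turn the quartiles into lower and upper fences */
    data fences;
      set quartiles;
      iqr = q3 - q1;                   /* interquartile range */
      lower_fence = q1 - 1.5 * iqr;
      upper_fence = q3 + 1.5 * iqr;
      keep sex q1 q3 iqr lower_fence upper_fence;
    run;

    /* Step 3: list the students whose weight falls outside the fences */
    proc sql;
      select c.name, c.sex, c.weight, f.lower_fence, f.upper_fence
      from sashelp.class as c
           inner join fences as f
           on c.sex = f.sex
      where c.weight < f.lower_fence or c.weight > f.upper_fence;
    quit;
    ```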

    Note: if the data are normally distributed, the Q1 to Q3 range of the box plot corresponds to the central 50% of the normal curve around its peak, that is, the mean ± 0.6745σ.
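    The 0.6745 figure can be checked with the QUANTILE and CDF functions. This is only a sanity check on the standard normal distribution, not part of the plotting code; the variable names z and central are arbitrary.

    ```sas
    data _null_;
      /* 75th percentile of the standard normal = distance from the mean to Q3, in units of sigma */
      z = quantile('NORMAL', 0.75);
      /* probability mass between -z and +z, i.e. between Q1 and Q3 */
      central = cdf('NORMAL', z) - cdf('NORMAL', -z);
      put z= 6.4 central= 4.2;
    run;
    ```

    In the log, z comes out at about 0.6745 and the central mass at 0.50, which is the box of the box plot.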


    Example:

    We will use sashelp.class as an example and draw the box plot two ways, with PROC SGPLOT and with PROC TEMPLATE. Both produce the same result!

    **Basic Box Plot**

    image

    Interpretation:
    The median weight of female students is a little below 90. Reading off the plot, roughly 25% of female students weigh between 75 and 82, 25% weigh between 105 and 115, and the middle 50% fall between 85 and 102.

    Code:

    SGPLOT

    proc sgplot data=sashelp.class;
    title "Distribution of Weight by Sex";
    vbox weight / category= sex;
    run;

    TEMPLATE

    proc template;
    define statgraph ClassBox;
    begingraph;
    entrytitle "Distribution of Weight by Sex";
    layout overlay;
    boxplot y=weight x=sex ;
    endlayout;
    endgraph;
    end;
    run;

    proc sort data=sashelp.class out=class;
    by sex;
    run;
    proc sgrender data=class template=ClassBox;
    run;

    Advanced Box Plot:

    image

    Code:

    proc univariate data=sashelp.class;

    var weight ;
    class sex;
    ods output quantiles =q;
    run;

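    /* Keep only the five labelled quantiles (Max, Q3, Median, Q1, Min) */
    data q2(rename=(estimate=weight) where=(quantile ne " "));
    set q;
    /* "100% Max" -> "Max"; unlabelled rows such as "99%" become blank and are dropped */
    quantile = scan(quantile, 2, " ");
    run;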

    proc template;
    define statgraph bpp;
    begingraph;
    entrytitle "Distribution of Weight by Sex" ;
    layout overlay;
    boxplotparm y=weight x=sex stat=quantile;
    endlayout;
    endgraph;
    end;
    run;

    proc sgrender data=q2 template=bpp;
    run;

    With the extra UNIVARIATE step, we have a summary dataset that we can use to cross-validate the graph.
    Indeed, we can see that the minimum weight among female students is about 50.
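    One way to do that cross-check is to print the summary rows next to the graph. This sketch assumes the q2 dataset created by the code above and simply filters it to female students.

    ```sas
    /* Show the quantiles that BOXPLOTPARM receives for female students */
    proc print data=q2 noobs;
      where sex = "F";
      var sex quantile weight;   /* weight holds the renamed estimate column */
    run;
    ```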

    image

    Reference:

    https://www.khanacademy.org/math/statistics-probability/summarizing-quantitative-data/box-whisker-plots/a/box-plot-review

    https://towardsdatascience.com/understanding-boxplots-5e2df7bcbd51

    Creating Statistical Graphics in SAS, Warren F. Kuhfeld

    Happy Practicing!
