SAS Programming Practice --- Macro: Hierarchical Subject Counts by System Organ Class and Preferred Term

Author: RSP小白之路 | Published: 2023-10-31 00:37

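Foreword

In clinical trial statistical programming, a novice like me spends most of the time as a "parameter-filling tool": build the ADaM datasets, fill in the parameters described in the documentation of the company's shared macros, and the statistical tables come out.

I now plan to use my spare time to learn to write macros. I have no access to the source code of the company's shared macros, so I will write macros that reproduce the same functionality based on my own understanding from work.

The macro documented here counts, hierarchically by System Organ Class (SOC) and Preferred Term (PT), the number of subjects and the number of events for adverse events/reactions (AE). It is named AESOCPT.

This article covers:

  • Target table breakdown
  • Sample data
  • Macro parameters
  • Macro structure
  • Writing the macro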

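1. Target table breakdown

[Figure: target table]

The figure above is the summary table, produced from the adae dataset, of the number of subjects and the number of events for AEs by SOC and PT.

  • Looking at the rows, the figures to compute are:
    1. the subject and event counts over all subjects;
    2. by SOC, the subject and event counts for each SOC;
    3. by SOC and PT, the subject and event counts for each combination.
  • Looking at the columns, the information presented is:
    1. the first column is the variable that defines the hierarchy;
    2. next come the subject and event counts for each treatment group;
    3. the last two columns are the subject and event counts for all groups combined.

Note that an incidence rate is calculated per group for the subject counts; no incidence rate is calculated for the event counts.

Also, if the same subject has the same AE (same SOC, same PT) several times, it counts once toward the number of subjects but can count several times toward the number of events.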
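The difference between the two counts can be illustrated with a minimal sketch. The dataset demo_ae and its four rows are made up for illustration and are not part of the macro; the sketch only shows that counting distinct subjects and counting records give different numbers once a subject repeats an AE.

```sas
/* Made-up rows: subject X1-001 reports PT11 twice and PT12 once */
data demo_ae;
    length usubjid $10 soc $10 pt $10;
    input usubjid $ soc $ pt $;
    datalines;
X1-001 SOC1 PT11
X1-001 SOC1 PT11
X1-001 SOC1 PT12
X1-002 SOC1 PT11
;
run;

proc sql;
    /* PT level: PT11 gives 2 subjects and 3 events, PT12 gives 1 and 1 */
    select soc, pt,
           count(distinct usubjid) as n_subjects,
           count(*)                as n_events
        from demo_ae
        group by soc, pt;

    /* SOC level: SOC1 gives 2 subjects and 4 events */
    select soc,
           count(distinct usubjid) as n_subjects,
           count(*)                as n_events
        from demo_ae
        group by soc;
quit;
```

Note that the SOC-level subject count (2) is smaller than the sum of the PT-level subject counts (3), because X1-001 appears under two PTs of the same SOC.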


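2. Sample data

This table is normally produced from adae data. The adsl data is also needed to get the total number of subjects and the number of subjects in each group.

The program below generates the sample data; there are many other ways to do this.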

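%let seed1 = 111111111;

/* adae: 3 arms x 100 subjects, exactly one AE record per subject */
data adae;
    do ii = 1 to 3;
        armn = ii;
        /* arm label such as 第1组, meaning Group 1 */
        arm = cats("第",put(ii, best.) ,"组");
        do jj = 1 to 100;
            usubjid = cats( "X",put(ii, best.) ,"-", put(jj, z3.));
            /* SOC1 to SOC6, drawn from a binomial distribution */
            soc = cats( "SOC",put(ranbin( &seed1., 5, 0.2) + 1, best.) );
            /* PT name = PT + the SOC digit + a random number from 1 to 11 */
            pt = cats( "PT", compress(soc, , "kd") , put(ranbin( &seed1., 10, 0.1) + 1, best.) );
            output;
        end;
    end;
run;

/* adsl: the same 3 x 100 subjects, one row each */
data adsl;
    do ii = 1 to 3;
        armn = ii;
        arm = cats("第",put(ii, best.) ,"组");
        do jj = 1 to 100;
            usubjid = cats( "X",put(ii, best.) ,"-", put(jj, z3.));
            output;
        end;
    end;
run;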

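3. Macro parameters

SAS processes data in datasets, so the macro needs the input dataset and its library, as well as the output dataset and its library. I named these libin, dtin, libout and dtout.

To count subjects, the macro also needs the adsl dataset, the subject identifier USUBJID, and the grouping variable grpvarn. Note that grpvarn must be a numeric variable derived from the grouping information beforehand; the code below assumes its values are the consecutive integers 1, 2, ... (one per group).

Most importantly, since the counts are by SOC and PT, the SOC and PT variables must be specified as well (l1var and l2var).

Finally, to choose whether row totals or column totals are computed, there are the parameters rowsumyn and colsumyn, which accept Y or N.

I named this macro AESOCPT. A call looks like this: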

%AESOCPT(libin = work,
    dtin = adae,
    adsl = adsl,
    usubjid = usubjid,
    l1var = soc,
    l2var = pt,
    grpvarn = armn,
    rowsumyn = Y,
    colsumyn = Y,
    libout = work,
    dtout = table);
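The code sections later in the article show the body of the macro but not its %macro and %mend statements. As a minimal sketch, the shell around that body might look like the following. The parameter names come from the call above; the default values and the Y/N check are my assumptions, not something the original macro is known to contain.

```sas
/* Sketch only: default values and the Y/N check are assumptions */
%macro AESOCPT(libin=work, dtin=, adsl=, usubjid=usubjid,
               l1var=, l2var=, grpvarn=,
               rowsumyn=Y, colsumyn=Y,
               libout=work, dtout=);

    /* Stop early when a switch holds anything other than Y or N */
    %if %upcase(&rowsumyn.) ne Y and %upcase(&rowsumyn.) ne N %then
        %do;
            %put ERROR: rowsumyn must be Y or N, got &rowsumyn.;
            %return;
        %end;

    %if %upcase(&colsumyn.) ne Y and %upcase(&colsumyn.) ne N %then
        %do;
            %put ERROR: colsumyn must be Y or N, got &colsumyn.;
            %return;
        %end;

    /* _1. pre-processing          */
    /* _2. main statistical step   */
    /* _3. processing step of stat */
    /* _4. output steps            */

%mend AESOCPT;
```

Keyword parameters (name=value) let the caller pass arguments in any order and omit those that have a usable default.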

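4. Macro structure

I am still a beginner at writing macros, so this is a process of continual practice, exploration and learning.

For now, I have organized the macro into four major steps:

    *_1. pre-processing;
    *_2. main statistical step;
    *_3. processing step of stat;
    *_4. output steps;

Step 1: pre-processing

In this step I mainly create and process macro variables and prepare the input dataset. For this macro, that covers:

    *_1. pre-processing;
    *_1.1 macro variables;
    *subjid number;

    *_1.2 input datasets processing;
    *_1.2.1 for times of case;
    *_1.2.2 for number of case;

  • creating the macro variables that hold the subject counts;
  • preparing the input dataset for the event counts;
  • preparing the input dataset for the subject counts.

Step 2: main statistical step

The core statistical step is split into the following sub-steps:

    *_2. main statistical step;
    *_2.1  number of case;
    *_2.2  times of case;
    *_2.3  calculation of the sum for each row;
    *_2.4  calculation of the sum for each column;

  • subject counts;
  • event counts;
  • the total for each row;
  • the total for each column.

Step 3: post-statistics processing

Step 4: output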

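5. Writing the macro

Below is the full code I wrote for this macro. I am a beginner, so please forgive any oversights.

5.1 Pre-processing

5.1.1 Assigning macro variables

First, following the structure above, assign the macro variables that hold the total number of subjects and the number of subjects in each group.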

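    *_1. pre-processing;
    *_1.1 macro variables;
    *subjid number;
    /* number of groups and total number of subjects, both taken from adsl */
    proc sql noprint;
        select count(distinct  &grpvarn.) ,  count(distinct  &usubjid.)
            into :grpnum trimmed, :SUBN999 trimmed
            from  &adsl.;
    quit;

    %put Subjects: &SUBN999.   Groups: &grpnum.;

    /* subjects per group; assumes the group variable is coded 1, 2, ... up to the number of groups */
    %do xx = 1 %to &grpnum.;

        proc sql  noprint;
            select  count(distinct  &usubjid.)  into :SUBN&xx. trimmed
                from  &adsl. where &grpvarn. = &xx.;
        quit;

        %put Subjects in group &grpvarn. = &xx.: &&SUBN&xx.;
    %end;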
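As an aside, the per-group counts can also be collected in a single query instead of one query per group. The following is a minimal sketch in open code against the sample adsl from section 2; the macro variable names GRP1 to GRP99 are my own, and the upper bound of 99 groups is an assumption.

```sas
/* Assumes the sample adsl from section 2, with armn coded 1, 2, 3 */
proc sql noprint;
    select armn, count(distinct usubjid)
        into :GRP1-:GRP99, :SUBN1-:SUBN99
        from adsl
        group by armn
        order by armn;
quit;

/* SQLOBS holds the number of rows the SELECT returned, i.e. the number of groups */
%let grpnum = &sqlobs.;

%put Groups: &grpnum.   SUBN1=&SUBN1. SUBN2=&SUBN2. SUBN3=&SUBN3.;
```

With the sample adsl this should report 3 groups of 100 subjects each. Because the group values are stored in GRP1, GRP2, ..., this variant does not rely on the groups being coded as consecutive integers.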

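5.1.2 Processing the input dataset

    *_1.2 input datasets processing;
    data stdt0;
        set &libin..&dtin.;
    run;

    /* one row per level-1 term (SOC); keep only the key so the later merge cannot overwrite other variables */
    proc sort data=stdt0 out=socn_(keep=&l1var.) nodupkey;
        by &l1var.;
    run;

    /* numeric order key for each SOC: 1, 2, ... in sorted order */
    data &l1var.n;
        set socn_;
        &l1var.n = _N_;
    run;

    proc sort data=&l1var.n;
        by &l1var.;
    run;

    proc sort data=stdt0;
        by &l1var.;
    run;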

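5.1.2.1 Dataset for the event counts

    *_1.2.1 for times of case;
    /* attach the numeric SOC key to every AE record */
    data times1;
        merge stdt0
            &l1var.n;
        by &l1var.;
    run;

    /* copy of all records relabelled as the overall total row; "合计" means Total */
    data times2;
        set  times1;
        &l1var. = "合计";
        &l1var.n = 0;
        &l2var. = "合计";
    run;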

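5.1.2.2 Dataset for the subject counts

    *_1.2.2 for number of case;
    data case1;
        merge stdt0
            &l1var.n;
        by &l1var.;
    run;

    /* NODUPKEY rather than NODUP: one row per subject, SOC and PT, even when other variables differ */
    proc sort data=case1 out=cs1nodup nodupkey dupout=cs1dup;
        by  &usubjid. &l1var.n  &l1var.  &l2var.;
    run;

    /* one row per subject and SOC */
    proc sort data=case1 out=cs2nodup nodupkey dupout=cs2dup;
        by  &usubjid. &l1var.n  &l1var.;
    run;

    /* one row per subject, so the overall total counts each subject once */
    proc sort data=case1 out=cs3nodup nodupkey;
        by  &usubjid.;
    run;

    data case2;
        set  cs3nodup;
        &l1var. = "合计";
        &l1var.n = 0;
        &l2var. = "合计";
    run;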
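Which PROC SORT option removes the duplicates matters for the subject counts. The following is a minimal sketch on a made-up two-row dataset (demo_dup and the aeseq variable are assumptions, standing in for the extra variables a real adae carries). It shows that NODUPRECS (alias NODUP) compares whole records, while NODUPKEY compares only the BY variables.

```sas
/* Made-up rows: same subject, SOC and PT, but different sequence numbers */
data demo_dup;
    length usubjid $10 soc $10 pt $10;
    input usubjid $ soc $ pt $ aeseq;
    datalines;
X1-001 SOC1 PT11 1
X1-001 SOC1 PT11 2
;
run;

/* NODUPRECS compares every variable: the rows differ in aeseq, so both stay */
proc sort data=demo_dup out=by_record noduprecs;
    by usubjid soc pt;
run;

/* NODUPKEY compares only the BY variables: one row stays */
proc sort data=demo_dup out=by_key nodupkey;
    by usubjid soc pt;
run;
```

The sample adae in section 2 has exactly one record per subject, so it cannot reveal this difference; real AE data, where a subject may have several records, can.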

5.2 Main statistical steps

5.2.1 Subject counts and incidence

    *_2.main statistical step;
    *_2.1  number of case;
    %do aa = 1 %to &grpnum.;

        proc sql noprint;
            create table ST_&aa. as

            /* SOC rows read cs2nodup so that each subject counts once per SOC */
            select   &l1var.n, &l1var., "合计" as  &l2var., 
                cats(sum(&grpvarn. = &aa.), "(", put(sum(&grpvarn. = &aa.)/&&SUBN&aa.*100, 8.2), ")") as CASE_&aa.,
                sum(&grpvarn. > 0) + 0.2 as seq1
            from cs2nodup
                group by   &l1var.n, &l1var.

                    union 
                select &l1var.n, &l1var., &l2var., 
                    cats(sum(&grpvarn. = &aa.), "(", put(sum(&grpvarn. = &aa.)/&&SUBN&aa.*100, 8.2), ")") as CASE_&aa.,
                    sum(&grpvarn. > 0) +0.1  as seq1
                from cs1nodup
                    group by  &l1var.n, &l1var., &l2var.

                        union 
                    select   &l1var.n, &l1var., "合计" as &l2var., 
                        cats(sum(&grpvarn. = &aa.), "(", put(sum(&grpvarn. = &aa.)/&&SUBN&aa.*100, 8.2), ")") as CASE_&aa.,
                        sum(&grpvarn. > 0) + 1 as seq1
                    from case2
                        group by    &l1var.n, &l1var.
            ;
        quit;

        proc sort data=  ST_&aa.;
            by &l1var.n &l1var.  &l2var.;
        run;

    %end;
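Two idioms carry most of the query above: a comparison inside SUM() to count matching rows, and CATS() with PUT() to build the "n(percent)" cell. A minimal sketch of both follows; the values n = 5 and bign = 100 are made up, and the second step assumes the sample adae from section 2.

```sas
/* Building an n(percent) cell: CATS strips the leading blanks that PUT leaves */
data _null_;
    n    = 5;      /* made-up: subjects with the event */
    bign = 100;    /* made-up: subjects in the group   */
    length cell $20;
    cell = cats(n, "(", put(n / bign * 100, 8.2), ")");
    put cell=;     /* the log shows cell=5(5.00) */
run;

/* A comparison evaluates to 1 or 0, so SUM() of it counts the matching rows */
proc sql;
    select sum(armn = 1) as n_group1,
           sum(armn > 0) as n_all
        from adae;
quit;
```

With the sample adae, which has 100 records per arm, the query returns 100 and 300.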

5.2.2 Event counts

    *_2.2  times of case;
    %do aa = 1 %to &grpnum.;

        proc sql noprint;
            create table ST_&aa._ as

            select   &l1var.n, &l1var., "合计" as  &l2var., 
                cats(sum(&grpvarn. = &aa.)) as CASE_&aa._ ,
                sum(&grpvarn. > 0) + 0.2 as seq2
            from times1
                group by   &l1var.n, &l1var.

                    union 
                select &l1var.n, &l1var., &l2var., 
                    cats(sum(&grpvarn. = &aa.)) as CASE_&aa._ ,
                    sum(&grpvarn. > 0) +0.1  as seq2
                from times1
                    group by  &l1var.n, &l1var., &l2var.

                        union 
                    select   &l1var.n, &l1var., "合计" as &l2var., 
                        cats(sum(&grpvarn. = &aa.)) as CASE_&aa._ ,
                        sum(&grpvarn. > 0) + 1 as seq2
                    from times2
                        group by    &l1var.n, &l1var.
            ;
        quit;

        proc sort data=  ST_&aa._;
            by &l1var.n &l1var.  &l2var.;
        run;

    %end;

5.2.3 Whether to compute the total for each row

    %if %upcase(&rowsumyn.) = Y %then
        %do;
            %put NOTE: computing the total for each row;

5.2.3.1 Row totals for the subject counts and incidence

            *_2.3.1  calculation of each row for number of case;
            proc sql noprint;
                create table ST_99 as

                /* SOC rows read cs2nodup so that each subject counts once per SOC */
                select   &l1var.n, &l1var., "合计" as  &l2var.,  1  as idid,
                    cats(sum(&grpvarn.  > 0), "(", put(sum(&grpvarn.  > 0)/&SUBN999.*100, 8.2), ")") as CASE_99,
                    sum(&grpvarn. > 0) + 0.2 as seq1
                from cs2nodup
                    group by   &l1var.n, &l1var.

                        union 
                    select   &l1var.n, &l1var.,   &l2var.,  2 as idid,
                        cats(sum(&grpvarn. > 0), "(", put(sum(&grpvarn. > 0)/&SUBN999.*100, 8.2), ")") as CASE_99,
                        sum(&grpvarn. > 0) + 0.1 as seq1
                    from  cs1nodup 
                        group by   &l1var.n, &l1var., &l2var.

                            union 
                        select   &l1var.n, &l1var., "合计" as &l2var.,   3 as idid,
                            cats(sum(&grpvarn. > 0), "(", put(sum(&grpvarn. > 0)/&SUBN999.*100, 8.2), ")") as CASE_99,
                            sum(&grpvarn. > 0) + 1 as seq1
                        from case2
                            group by    &l1var.n, &l1var.
                ;
            quit;

            proc sort data=  ST_99;
                by &l1var.n  &l1var.  &l2var.;
            run;

5.2.3.2 Row totals for the event counts

            *_2.3.2  calculation of each row for times of case;
            proc sql noprint;
                create table ST_99_ as

                select   &l1var.n, &l1var., "合计" as  &l2var.,  1  as idid,
                    cats(sum(&grpvarn.  > 0) ) as CASE_99_,
                    sum(&grpvarn. > 0) + 0.2 as seq2
                from times1
                    group by   &l1var.n, &l1var.

                        union 
                    select   &l1var.n, &l1var.,   &l2var.,  2 as idid,
                        cats(sum(&grpvarn.  > 0) ) as CASE_99_,
                        sum(&grpvarn. > 0) + 0.1 as seq2
                    from  times1 
                        group by   &l1var.n, &l1var., &l2var.

                            union 
                        select   &l1var.n, &l1var., "合计" as &l2var.,   3 as idid,
                            cats(sum(&grpvarn.  > 0) ) as CASE_99_,
                            sum(&grpvarn. > 0) + 1 as seq2
                        from times2
                            group by    &l1var.n, &l1var.
                ;
            quit;

            proc sort data=  ST_99_;
                by &l1var.n  &l1var.  &l2var.;
            run;

        %end;

Otherwise, a log message notes that row totals are not computed:

    %else
        %do;
            %put NOTE: row totals are not computed;
        %end;

5.2.4 Whether to compute the total for each column

    *_2.4   calculation of the sum for each column;
    data  _0&dtout.;
        merge  ST_:
        ;
        by &l1var.n  &l1var.  &l2var.;

        %if %upcase(&colsumyn.) = Y %then
            %do;
                %put  NOTE: column totals are kept;
            %end;
        %else
            %do;
                %put  NOTE: column totals are not computed;

                /* the overall total row is the one whose numeric SOC key is 0 */
                if &l1var.n = 0 then
                    delete;
            %end;
    run;

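5.3 Post-statistics processing

    *_3 processing step of stat;

    /* SOC order first; within a SOC the summary row comes first, then the PTs by decreasing count */
    proc sort data=_0&dtout.;
        by &l1var.n descending seq1 descending seq2;
    run;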

    data _1&dtout.;
        set  _0&dtout.;
        by &l1var.n descending seq1 descending seq2;

        /* the first row of each SOC keeps the SOC name; later rows show the indented PT */
        if not first.&l1var.n then
            &l1var. = "    "||&l2var.;
        keep &l1var.  CASE_:;
    run;
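The labelling step relies on BY-group processing: FIRST.variable is 1 on the first row of each BY group and 0 otherwise, and it only exists for variables named in the BY statement. A minimal sketch on made-up rows follows; demo_rows, the label variable and the text Total (standing in for "合计") are assumptions for illustration.

```sas
/* Made-up rows, already sorted by socn and descending seq1 */
data demo_rows;
    length soc $20 pt $20;
    input socn soc $ pt $ seq1;
    datalines;
1 SOC1 Total 5.2
1 SOC1 PT11 3.1
1 SOC1 PT12 2.1
2 SOC2 Total 2.2
2 SOC2 PT21 2.1
;
run;

data demo_label;
    set demo_rows;
    by socn descending seq1;
    length label $30;
    if first.socn then label = soc;      /* summary row shows the SOC name */
    else label = "    " || pt;           /* PT rows are indented by 4 blanks */
    keep label seq1;
run;
```

The fractional part of seq1 (0.2 for the SOC row, 0.1 for PT rows) is what guarantees that the summary row sorts ahead of a PT row with the same count.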

5.4 Output

    * _4.output steps;
    proc contents data=  _1&dtout.  out= _1outs noprint;
    run;

    /* column names in dataset order; SQLOBS then holds the number of names read */
    proc sql noprint;
        select NAME into :col1-:col99 from _1outs order by varnum;
    quit;

    %let varn = &sqlobs.;

    data &libout..&dtout.;
        set  _1&dtout.;

        %do ii = 1 %to &varn.;
            if &&col&ii. = "0(0.00)" then
                &&col&ii. ="0";
            %let jj = %eval(&ii. - 1);
            rename &&col&ii. = C&jj.;
        %end;
    run;

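    /* remove intermediate datasets; caution: the prefixes also match any other WORK dataset that starts with them */
    proc datasets lib=work noprint;
        delete  socn_ &l1var.n times: case: cs: st:  _:;
    run;
    quit;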

That is all. If I have missed anything, corrections are welcome.

Article link: https://www.haomeiwen.com/subject/jxtiidtx.html