Wednesday 14 December 2016

Select highest 5 and lowest 5 values

Recently, I was asked this question: given a data set with a variable AGE whose values are all distinct, how do you select the highest 5 and lowest 5 values of AGE?

In the first step, I sort the data set by AGE in ascending order.

     PROC SORT DATA=SASHELP.CLASS OUT=CLASS;
          BY AGE;
     RUN;

Once the data set is sorted, I use the data step below to read the lowest 5 and highest 5 values from it.

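     DATA TEST;
           /* Never executes, but NOBS= is populated at compile time */
           IF 0 THEN SET CLASS NOBS=NOBS;
           PUT NOBS;   /* write the observation count to the log */
           /* Rows 1-5 hold the lowest ages, the last 5 rows the highest */
           DO I=1 TO 5, NOBS-4 TO NOBS;
                SET CLASS POINT=I;
                OUTPUT;
           END;
           STOP;   /* required: with POINT= the step never reaches end of file */
     RUN;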


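First, I use the IF 0 THEN SET statement to get the number of observations. The condition is never true, so no row is read there, but the NOBS= option is filled in when the step is compiled, so the variable NOBS is available from the start. Since the data set is sorted in ascending order, I then use the POINT= option to read observations 1 to 5 and NOBS-4 to NOBS (15 to 19 for the 19 rows of SASHELP.CLASS), which gives me the lowest 5 and the highest 5 values of the variable. The STOP statement ends the step, because a data step that reads only with POINT= would otherwise loop forever. Note that AGE in SASHELP.CLASS itself contains repeated values, so with this sample data the step returns the 5 lowest and 5 highest observations rather than 10 distinct ages.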
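
The data step above assumes the data set has at least 10 observations. With fewer rows, the two ranges overlap (or NOBS-4 falls below 1), so some rows are written twice or POINT= receives an invalid row number. The sketch below is one possible way to guard against that, not part of the original solution. It assumes the sorted data set CLASS from the PROC SORT step; the output name TEST_SAFE and the variable GROUP are made up for illustration.

     DATA TEST_SAFE;                       /* hypothetical output name */
           IF 0 THEN SET CLASS NOBS=NOBS;
           LENGTH GROUP $7;                /* hypothetical label variable */
           /* First range stops at NOBS when there are fewer than 5 rows.   */
           /* Second range never starts before row 6, so no row is repeated. */
           DO I=1 TO MIN(5, NOBS), MAX(NOBS-4, 6) TO NOBS;
                SET CLASS POINT=I;
                IF I <= 5 THEN GROUP='LOWEST';
                ELSE GROUP='HIGHEST';
                OUTPUT;
           END;
           STOP;
     RUN;

With 10 or more rows this reads the same observations as the original step. With 7 rows, for example, the ranges become 1 to 5 and 6 to 7, so every row is written exactly once.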

reference: http://bharathtn.blogspot.in/2016/12/select-highest-5-and-lowest-5-values.html?view=flipcard
