Base SAS - Logical Sub-setting - WHERE Condition
Let's use the following data, copy and
paste the below code to make the data (Insurance).
Data Insurance;
infile datalines ;
input Name $
Gender $ Year Claim_amount Policy_Type : $20. Sum_assured
Monthly_Income
;
Datalines;
VINOD M 2009 6069 ENDOWMENT
300000 21000
DEEPA F 2010 9911 TERM
100000 7000
DINESH M 2009 7147 TERM
400000 28000
MONAL F 2011 9418 TERM
200000 14000
NEETU F 2011 9635 ENDOWMENT
200000 14000
VINOD M 2010 6105 WHOLELIFE
100000 7000
TANGO M 2010 5914 TERM
400000 28000
RAJAT M 2011 7599 ENDOWMENT
300000 21000
NEETU F 2011 1466 TERM
200000 14000
MONAL F 2012 5265 TERM
200000 14000
BALJEET M 2010 8197 ENDOWMENT
500000 35000
SANTOSH M 2011 1078 ENDOWMENT
400000 28000
RAJAT M 2012 1006 WHOLELIFE
400000 28000
DINESH M 2010 3739 WHOLELIFE
500000 35000
DEEPA F 2011 3818 TERM
200000 14000
;
Run;
Now let's solve the following 9 queries:
Create the following data sets by using
the where clause :
1. List out the insured people who has
term policy
2. Insured getting salary in the range
10000-25000
3. People who claimed in year 2009 and
2010
4. two different dataset for male and
female.
5. Insured whose name starting with ‘S
‘ or ‘V’
6. Insured whose name ending with ‘H’
OR ‘T’
7. Insured whose name consist at least
one ‘N’
8. Insured whose name does not consist
‘N’
9. Insured who has term policy and
getting salary in range of 15000-30000 claimed in the year 2009.
Answers:
1. List out the insured people who has
term policy
data Insurance_1;
set insurance;
where policy_type = "TERM";
Run;
2. Insured getting salary in the range
10000-25000
data Insurance_2;
set insurance;
WHERE monthly_income between 10000 and 25000;
Run;
proc print;
run;
or
data Insurance_2;
set insurance;
WHERE monthly_income > 10000 and monthly_income < 25000;
Run;
proc print;
run;
3. People who claimed in year 2009 and
2010
data Insurance_3;
set insurance;
WHERE year in (2009,2010);
Run;
proc print;
run;
4. two different dataset for male and
female.
data female ;
set insurance ;
where gender='F';
run;
data male ;
set insurance ;
where gender=‘M';
run;
5. Insured whose name starting with ‘S
‘ or ‘V’
Data Insurance_5;
set insurance;
WHERE name like 'S%' or name like 'V%' ;
Run;
%
is called wild character, S% translates to a character with first letter as
"S" and then anything ...
S%T means that a character starting with S and ending with T. It can be of any length. There is one more wild card character _ , which means anything of 1 length only.
S%T means that a character starting with S and ending with T. It can be of any length. There is one more wild card character _ , which means anything of 1 length only.
6. Insured whose name ending with ‘H’
OR ‘T’
Data Insurance_6;
set insurance;
WHERE name like '%H' or name like '%T' ;
Run;
7. Insured whose name consist at least
one ‘N’
data Insurance_7;
set insurance;
WHERE name contains 'N' ;
Run;
8. Insured whose name does not consist
‘N’
data Insurance_8;
set insurance;
WHERE name not contains 'N' ;
Run;
9. Insured who has term policy and
getting salary in range of 15000-30000 claimed in the year 2009.
data high_income_9;
set insurance;
WHERE policy_type = 'TERM'
and (Monthly_income ge 15000 and monthly_income le 30000)
and year = 2009 ;
Run;
proc print;
run;
No comments:
Post a Comment