Question 1:-
What is the LRECL= option?
By default, SAS assumes that records in an external file are no longer than a fixed logical record length (256 in older releases; SAS 9.4 raised the default to 32767). If your data lines are longer than that limit, SAS truncates them and does not read the data correctly. To tell SAS the record length of the external file, use the LRECL= option in the INFILE statement.
Ex: Infile 'c:\data\class.txt' LRECL=1500;
Question 2:-
What are the limitations of List Input Style (Free Formatted Input Style)?
The limitations are as follows:
1. All the data in a record must be read; unwanted data cannot be skipped.
2. Character values can't have embedded spaces.
3. Data that requires special treatment (informats, lengths, etc.) can't be read.
4. Character values longer than 8 characters can't be read in full.
5. Missing values must be explicitly indicated (for example, with a period).
Question 3:-
What are the advantages of Column Input Style over List Input Style?
1. Character data can have embedded blanks.
2. Character data longer than 8 characters can be read.
3. Unwanted data can be skipped.
4. Missing values can be left blank.
Question 4:-
Observe the following program. What is wrong with the program? The output will be as below.
Obs  name                   code      year  amt
1    Yellowstone            ID/MT/WY  1872  4065493
2    Everglades             FL        934   13
3    Yosemite               CA        1864  7
4    Great Smoky Mountains  NC/TN     1926  520
5    Wolf Trap Farm         VA        1966  .
data abc;
input name$ 1-21 code$ year amt comma10.;
cards;
Yellowstone ID/MT/WY 1872 4,065,493
Everglades FL 1934 1,398,800
Yosemite CA 1864 760,917
Great Smoky Mountains NC/TN 1926 520,269
Wolf Trap Farm VA 1966 130
;
run;
proc print;
run;
When SAS reads the data, it uses a pointer to mark its place, and it uses this pointer differently for each input style. For list input style, SAS automatically scans to the next nonblank character and reads the value. For column input style, SAS reads exactly the columns you specify. For formatted input style, SAS starts reading wherever the pointer currently is. In the above example, amt is read with formatted input style (comma10.), so SAS starts reading immediately after the year value and reads exactly 10 columns, regardless of where the amount actually starts or ends. Hence the output for this variable is not correct. To fix it, either use a column pointer (@n) to position the pointer before reading amt, or use the colon modifier (amt :comma10.) so that SAS scans to the next nonblank character first.
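A minimal sketch of one possible fix, using the colon modifier so that SAS scans to the next nonblank character before applying the comma10. informat. The dataset name abc_fixed and the exact spacing of the data lines are assumptions for illustration: the names are padded so that name occupies columns 1-21, since the original alignment is not shown.

```sas
/* Sketch only: abc_fixed is a hypothetical dataset name, and the     */
/* data lines are padded so that name fills columns 1-21.             */
data abc_fixed;
   input name $ 1-21 code $ year amt :comma10.;
   cards;
Yellowstone           ID/MT/WY 1872 4,065,493
Everglades            FL 1934 1,398,800
Yosemite              CA 1864 760,917
Great Smoky Mountains NC/TN 1926 520,269
Wolf Trap Farm        VA 1966 130
;
run;

proc print data=abc_fixed;
run;
```

With the colon modifier, the length of each amount no longer matters, because SAS reads up to the next blank instead of a fixed 10 columns.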
Question 5:-
What are / and #n?
/ and #n are line pointers. / tells SAS to go to the next data line. #n tells SAS to go to the nth line within the observation. Unlike /, #n can move both forwards and backwards.
Question 6:-
What is the output for the following program?
data abc;
input city$ State$
//
i1 i2
#2 i3 i4;
cards;
Nome AK
55 44
88 29
Miami FL
90 75
97 65
;
proc print;
run;
The two slashes (//) move the pointer forward two lines, so SAS jumps to the 3rd data line of the observation and reads i1 and i2. Because #n can move backwards as well as forwards, #2 then moves the pointer back to the 2nd line within the observation, where SAS reads i3 and i4.
So i1 values will be 88 and 97, i2 values will be 29 and 65, i3 values will be 55 and 90, and i4 values will be 44 and 75.
Question 7:-
What is the difference between single trailing @ and double trailing @@?
Both are line-hold specifiers. The difference is in how long they hold the line. A single trailing @ holds the line for the next INPUT statement within the same iteration of the data step; it releases the line when SAS starts building a new observation or encounters an INPUT statement without a trailing @.
A double trailing @@ holds the line across iterations of the data step; it releases the line when the pointer moves past the end of the data line or when SAS encounters an INPUT statement without a trailing @.
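As an illustration, the sketch below reads several observations from each data line with a double trailing @@. The dataset name scores and the sample values are made up for this example.

```sas
/* Sketch only: scores and its values are hypothetical.               */
/* @@ holds the line across iterations, so each name/score pair on    */
/* a line becomes its own observation.                                */
data scores;
   input name $ score @@;
   cards;
Ann 90 Bob 85 Cy 77
Dee 60 Eve 95
;
run;

proc print data=scores;
run;
```

The two data lines produce five observations. With a single trailing @ instead, the line would be released at the end of each iteration, and only the first pair on each line would be read.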
Question 8:-
What is the difference between Missover and Truncover?
Truncover is needed when you are using column or formatted input style and some data lines are shorter than others.
Both options assign missing values to variables if the data line ends before the variable's field starts. But when the data line ends in the middle of a variable's field, Truncover takes as much data as is available, whereas Missover assigns a missing value.
Question 9:-
What is the purpose of using the DSD option?
When your data has delimiters embedded in quoted values, and two consecutive delimiters indicate a missing value, use the DSD option in the INFILE statement. DSD also changes the default delimiter to a comma.
DSD does three things:
1. It ignores delimiters enclosed in quotation marks.
2. It does not read the quotation marks as part of the data.
3. It treats two consecutive delimiters as a missing value.
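The three behaviours can be seen in a small sketch like the one below. The dataset name people, the variable names, and the sample values are assumptions made for illustration.

```sas
/* Sketch only: people and its values are hypothetical.               */
data people;
   infile cards dsd;              /* DSD: comma-delimited by default  */
   input name :$20. city :$20. age;
   cards;
"Smith, John",Boston,34
Lee,,28
;
run;

proc print data=people;
run;
```

In the first line, the comma inside the quotation marks is kept as part of name and the quotation marks themselves are dropped. In the second line, the two consecutive commas make city missing.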
Question 10:-
Where will the dataset get stored in the following program?
data 'tree';
x=10;
y=20;
run;
In directory-based operating environments (Windows, for example), if you leave out the path, SAS uses the current working directory (you will see its path at the bottom right corner of the SAS window). In the given program, SAS will store the dataset Tree in the current working directory.
Question 11:-
Why is SAS called Self Documenting Software?
Whenever a data step is executed, SAS automatically stores the descriptor portion of the dataset. The descriptor portion includes:
1. Dataset name
2. Number of observations
3. Number of variables
4. Date on which the dataset was created
5. Attributes of each variable (Type, Length, Format, Informat, Label).
Question 12:-
How can we see the descriptor portion of a dataset?
By using proc datasets or proc contents we can see the descriptor portion of the dataset.
proc datasets lib=sashelp;
contents data=class;
run;
quit;
proc contents data=sashelp.class;
run;
The NODS option, used with data=_all_, tells SAS to list only the names of the datasets in the library, without the details of each one.
Question 13:-
What is the purpose of using VARNUM in Proc Contents?
Proc Contents lists the variables in alphabetical order. To tell SAS to list the variables in the order of their occurrence in the dataset, the VARNUM option is used.
proc contents data=sashelp.class VARNUM;
run;
Question 14:-
What are the naming conventions of SAS names?
1. The name should start with a letter or an underscore.
2. The name should contain only letters, numbers, and underscores.
3. The name should be 32 characters or fewer in length.
Question 15:-
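What are the results of the following expressions? Why does the result differ?
Assignment statement        Result
x = 10 * 4 + 3 ** 2;        49
x = 10 * (4 + 3 ** 2);      130
SAS follows the standard mathematical rules of precedence (BODMAS). It evaluates expressions in parentheses first, then exponentiation, then multiplication and division, then addition and subtraction. In the first statement, 10 * 4 = 40 and 3 ** 2 = 9 are added to give 49; in the second, the parentheses force 4 + 9 = 13 to be computed first, giving 10 * 13 = 130. Hence the results differ.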
Question 16:-
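Is it possible in SAS to subset your data using IF-THEN statements and single trailing @?
Yes. A single trailing @ holds the data line so that an IF statement can test a value before the rest of the line is read. In the program below, the subsetting IF keeps only the lines where a=1; for those lines the second INPUT statement reads b and c, and the remaining lines are skipped.
data abc;
input a @;
if a=1;
input b c;
cards;
1 2 3
1 2 3
4 5 6
7 8 9
1 5 9
1 6 7
;
proc print;
run;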
Question 17:-
What is the significance of January 1, 1960 in SAS?
SAS has only two data types, Numeric and Character. SAS stores any date value as a numeric value: the number of days between January 1, 1960 and the given date. Dates before January 1, 1960 are stored as negative numbers.
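A short sketch of how this looks in practice; the variable names d0, d1, and d2 are chosen only for this example.

```sas
/* Sketch only: date literals are stored as day counts from 01JAN1960 */
data _null_;
   d0 = '01JAN1960'd;   /* the reference date itself */
   d1 = '02JAN1960'd;   /* one day after             */
   d2 = '31DEC1959'd;   /* one day before            */
   put d0= d1= d2=;     /* writes d0=0 d1=1 d2=-1 to the log */
run;
```

Because dates are plain numbers, subtracting one date from another gives the number of days between them.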
Question 18:-
What is the significance of the year 1920 in SAS?
SAS interprets two-digit year values on the basis of the YEARCUTOFF= option, which sets the first year of a 100-year window. The default value has been 1920 in many releases (newer releases ship with a later default), and it can be changed. With YEARCUTOFF=1920, SAS looks at the 100-year block starting from 1920 (1920 to 2019) and interprets the two-digit year accordingly: 20 is read as 1920 and 19 as 2019.
options yearcutoff=1950;
Question 19:-
What is the result of the following data step?
data abc;
set sashelp.class;
array numeric(4);
run;
proc print;
run;
The array concept in SAS is used for grouping variables of the same type, and an ARRAY statement can also create such variables. In the given program, SAS creates 4 numeric variables named Numeric1, Numeric2, Numeric3, and Numeric4, each with the default length of 8 bytes. Since nothing is assigned to them, they hold missing values in every observation.
Question 20:-
What will be there in B variable if you execute the following functions?
B=index('I AM THERE AS THE KING', 'HER');
B=indexw('I AM THERE AS THE KING', 'HER');
The Index function returns 7, whereas Indexw returns zero.
The main difference between the Index and Indexw functions is that Indexw searches for the string as a whole word. Index finds HER inside THERE, starting at position 7, but there is no separate word HER in the given sentence, and hence Indexw returns zero. Both functions are case sensitive, so the search string must match the case of the text.
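The comparison can be tried with a sketch like the one below. The variable names s, b1, b2, and b3 are chosen only for this example; note that the search string is written in upper case, because both functions are case sensitive.

```sas
/* Sketch only: compares substring search with whole-word search */
data _null_;
   s  = 'I AM THERE AS THE KING';
   b1 = index(s, 'HER');    /* substring inside THERE           */
   b2 = indexw(s, 'HER');   /* no separate word HER             */
   b3 = indexw(s, 'THE');   /* the separate word THE            */
   put b1= b2= b3=;         /* writes b1=7 b2=0 b3=15 to the log */
run;
```

Indexw skips the THE inside THERE and reports the position of the standalone word.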
Question 21:-
Write the result for the following statement.
B2=CATX('A','B','C','D');
The CATX function treats the first argument as the separator. It concatenates B, C, and D with A in between. So the result will be BACAD.
Question 22:-
What is the result for the following program?
Data four;
x=200;
If x le 100 then grade='low';
else if x ge 200 then grade='high';
else grade='medium';
Run;
During compilation, SAS creates the variables and assigns their attributes. Since grade first appears with the value 'low', SAS assigns it a length of 3. Hence x will be 200 and grade will be 'hig' ('high' truncated to 3 characters).
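One common way to avoid this kind of truncation is to declare the length before the first assignment. The sketch below assumes a length of 6, which is enough for the longest value, 'medium'; the dataset name four_fixed is hypothetical.

```sas
/* Sketch only: a LENGTH statement fixes the length of grade before */
/* the compiler sees the first assignment.                          */
data four_fixed;
   length grade $ 6;
   x = 200;
   if x le 100 then grade = 'low';
   else if x ge 200 then grade = 'high';
   else grade = 'medium';
   put x= grade=;          /* writes x=200 grade=high to the log */
run;
```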
Question 23:-
How many times will “Better Code” be printed with the following program?
Data five;
Do I=1 to 5;
Do J=5 to 10;
Put 'BETTER CODE';
End;
End;
Run;
For each value of I, the inner loop runs 6 times (J = 5 to 10), so “Better Code” is written to the log 6 times. Hence it is printed 5 x 6 = 30 times in total.
Question 24:-
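What is the default value for options Linesize and Pagesize?
The defaults depend on the operating environment and on how SAS is run; commonly quoted values are:
Linesize=102
Pagesize=56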
Question 25:-
If you do not want to produce any dataset, how would you code the data statement to prevent SAS from producing a dataset?
By writing data _null_; we can instruct SAS not to produce any dataset.
Question 26:-
What is the output for the following statement?
K=translate('(080)-888 544 54','[]','()');
Translate function in the given statement replaces all ( with
[ and all ) with ]. Hence the output will be as follows:
[080]-888 544 54
Question 27:-
Can we use _N_ in a Where statement?
No. The Where statement is evaluated before observations are brought into the PDV, so automatic variables such as _N_, and any new variables created in the data step by assignment statements, cannot be used in a Where statement. A Where statement also cannot be used when you are reading external or in-stream raw data; it works only for already existing datasets.
Question 28:-
What are _N_ and _ERROR_?
They are automatic temporary variables that SAS creates during data step execution; they are not written to the output dataset. _N_ holds the number of times the data step has iterated, and _ERROR_ has the value 1 when there is a data error for the current observation (and 0 otherwise).
Question 29:-
What is the difference between the following statements?
input C1-C4;
input C1 a b d C4;
put C1--C4;
C1-C4 (single hyphen) is a numbered range list. Here SAS creates the variables C1, C2, C3, and C4.
C1--C4 (double hyphen) is a name range list. It refers to all the variables from C1 to C4 in the order in which they are defined. With the second INPUT statement, the PUT statement prints C1, a, b, d, and C4.
Question 30:-
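What is the difference between PUT function and PUT statement?
The PUT statement is used to write values or text to the required destination (the log by default). The PUT function applies a format to a value and returns the result as a character string, so it is commonly used to convert Numeric data into Character data.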
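A small sketch showing both in one step; the variable names n, c1, and c2 are chosen only for this example.

```sas
/* Sketch only: the PUT function returns character values,        */
/* while the PUT statement writes them to the log.                */
data _null_;
   n  = 1234.5;
   c1 = put(n, 6.1);               /* character value '1234.5'    */
   c2 = put('01JAN1960'd, date9.); /* character value '01JAN1960' */
   put c1= c2=;                    /* PUT statement writes to log */
run;
```

Here c1 and c2 are character variables, even though the values passed to the PUT function are numeric.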