Question-41: What are the restrictions on naming a libref?
Answer: A libref name must follow these rules:
- It can be only 1 to 8 characters long.
- It must begin with a letter or an underscore.
- It can contain only letters, numbers, and underscores.
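The rules above can be checked with a small Data step. This is only a sketch: the dataset name libref_check and the candidate names are made up for illustration, and it assumes the NVALID function with the 'v7' rule set (starts with a letter or underscore; only letters, digits, and underscores) combined with an explicit length test for the 8-character limit.

```sas
/* Hypothetical check: flag candidate libref names that break the naming rules. */
data libref_check;
    length candidate $ 20 valid $ 3;
    input candidate $;
    /* NVALID(...,'v7') covers the first-character and allowed-character rules;
       the LENGTH test adds the 8-character limit that applies to librefs. */
    if nvalid(strip(candidate), 'v7') and length(strip(candidate)) <= 8
        then valid = 'yes';
        else valid = 'no';
    datalines;
he_sas
_tmp1
1data
my-lib
toolongname
;
run;

proc print data=libref_check noobs;
run;
```

Of these sample names, only he_sas and _tmp1 are intended to pass: 1data begins with a number, my-lib contains a hyphen, and toolongname is longer than 8 characters.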
Question-42: Can you give an example of creating a permanent dataset?
Answer: You can use a Data step like the one below to create a permanent dataset.
libname he_sas 'C:\Users\HadoopExam\he_sasdata';
data he_sas.he_learner;
    /* statements that read or compute the data go here */
run;
This creates a permanent dataset named he_learner in the he_sas libref, which is stored in the 'C:\Users\HadoopExam\he_sasdata' folder.
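As a slightly fuller sketch, the step below fills the permanent dataset with a few rows and contrasts it with a temporary one. The variables learner_id, name, and score, the sample rows, and the dataset he_learner_tmp are hypothetical and only there to make the example runnable; the folder must already exist.

```sas
libname he_sas 'C:\Users\HadoopExam\he_sasdata';

/* Two-level name (libref.dataset): stored permanently in the libref's folder. */
data he_sas.he_learner;
    input learner_id name $ score;   /* hypothetical variables */
    datalines;
1 Amit 82
2 Sara 91
;
run;

/* One-level name: stored in the WORK library and deleted when the session ends. */
data he_learner_tmp;
    set he_sas.he_learner;
run;
```

In a later SAS session, issuing the same libname statement again makes he_sas.he_learner available, while he_learner_tmp is gone.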
Question-43: Why is the Data step said to be a loop?
Answer: The Data step is described as a loop because it keeps iterating over the records in the raw data file, one record per iteration, until all records have been read to create the new SAS dataset. Hence, if your raw file contains 1000 records and each iteration reads one record, the Data step builds 1000 observations in 1000 iterations.
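The implicit loop can be made visible by writing the automatic variable _N_ to the log on each pass. This is a minimal sketch: the dataset loop_demo, its variables, and the three in-line records are made up, and DATALINES stands in for an external raw file.

```sas
data loop_demo;
    input learner_id score;      /* reads one record per iteration */
    /* _N_ counts how many times the Data step has started an iteration. */
    put 'Iteration ' _n_ 'read learner ' learner_id;
    datalines;
101 82
102 91
103 77
;
run;
```

With three records, the log should show three such lines, and loop_demo should contain three observations, with no explicit loop statement in the code.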
Question-44: What does it mean that the first phase of a SAS Data step is the compile phase, and what happens in that phase?
Answer: Whenever you submit a Data step for execution, SAS first checks the syntax of its statements and, if they are correct, compiles them. Compiling the SAS statements means converting them into machine instructions that the computer's processor can understand, also known as generating machine code. During the compile phase of the Data step, SAS also creates the following three items:
- Input buffer
- Program data vector
- Descriptor information
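One way to see that compilation comes before execution is to submit a step with a deliberate syntax error. In the sketch below (the dataset name syntax_demo is hypothetical), the PUT statement appears before the faulty line, yet its message should not reach the log, because SAS rejects the whole step during the syntax check and never starts executing it.

```sas
data syntax_demo;
    put 'This message is never written';
    x = ;      /* deliberate syntax error: missing expression */
run;
```

The log should instead report a syntax error for the assignment statement and a note that SAS stopped processing the step because of errors.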
Question-45: Can you please elaborate on all three items created in the compile phase and what they are used for?
Answer: All three are containers for various data and are used during the execution phase. Let's discuss each of them.
- Input buffer: Whenever SAS reads a record from the raw data (during the execution phase), it first places the record in the input buffer. The input buffer is a logical area in memory (RAM) and holds only one record from the raw file at a time.
- Program data vector: This is also a logical area in memory, but SAS uses it to build the dataset. First, one record is read from the raw file into the input buffer. Next, SAS assigns each data value to a variable and builds a single observation in the program data vector. The program data vector holds only one observation at a time.
In addition to the observation, the program data vector also contains two automatic variables, _N_ and _ERROR_. _N_ counts the iterations of the Data step, and _ERROR_ is 0 unless an error occurs during the iteration, in which case it is set to 1.
- Descriptor information: This holds information about the dataset as a whole, such as the attributes of each variable and the number of observations in the dataset.
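All three items can be inspected from a small test step. This is a sketch under assumptions: the dataset pdv_demo, its variables, and the two in-line records are hypothetical. PUT _INFILE_ writes the current contents of the input buffer, PUT _ALL_ writes every variable in the program data vector, and PROC CONTENTS prints the descriptor information.

```sas
data pdv_demo;
    /* Variables already exist here (as missing values) because the
       program data vector was set up during the compile phase. */
    put 'Start of iteration: ' _all_;
    input learner_id score;
    put 'Input buffer:       ' _infile_;
    put 'After INPUT:        ' _all_;
    datalines;
101 82
102 91
;
run;

/* Descriptor information: variable names, types, lengths, observation count. */
proc contents data=pdv_demo;
run;
```

The "Start of iteration" line should appear once more than the number of records, because SAS begins a final iteration that ends when INPUT finds no more data. The PROC CONTENTS listing should show only learner_id and score, since _N_ and _ERROR_ live in the program data vector but are not written to the output dataset.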