SAS - Read Raw Data



SAS can read data from various sources which includes many file formats. The file formats used in SAS environment is discussed below.

  • ASCII(Text) Data Set
  • Delimited Data
  • Excel Data
  • Hierarchical Data

Reading ASCII(Text) Data Set

These are the files which contain the data on text format. The data is usually delimited by a space, but there can be different types of delimiters also which SAS can handle. Let’s consider an ASCII file containing the employee data. We read this file using the Infile statement available in SAS.

Example

In the below example we read the data file named emp_data.txt from the local environment.

data TEMP; 
   infile 
   '/folders/myfolders/sasuser.v94/TutorialsPoint/emp_data.txt'; 
   input empID empName $ Salary Dept $ DOJ date9. ;
   format DOJ date9.;
   run;
   PROC PRINT DATA = TEMP;
RUN;

When the above code is executed, we get the following output.

read_raw_data1

Reading Delimited Data

These are the data files in which the column values are separated by a delimiting character like a comma or pipeline etc. In this case we use the dlm option in the infile statement.

Example

In the below example we read the data file named emp.csv from the local environment.

data TEMP; 
   infile 
   '/folders/myfolders/sasuser.v94/TutorialsPoint/emp.csv' dlm=","; 
   input empID empName $ Salary Dept $ DOJ date9. ;
   format DOJ date9.;
   run;
   PROC PRINT DATA = TEMP;
RUN;

When the above code is executed, we get the following output.

read_raw_data1

Reading Excel Data

SAS can directly read an excel file using the import facility. As seen in the chapter SAS data sets, it can handle a wide variety of file types including MS excel. Assuming the file emp.xls is available locally in the SAS environment.

Example

FILENAME REFFILE
"/folders/myfolders/TutorialsPoint/emp.xls"
TERMSTR = CR;

PROC IMPORT DATAFILE = REFFILE
DBMS = XLS
OUT = WORK.IMPORT;
GETNAMES = YES;
RUN;
PROC PRINT DATA = WORK.IMPORT RUN;

The above code reads the data from excel file and gives the same output as above two file types.

Reading Hierarchical Files

In these files the data is present in hierarchical format. For a given observation there is a header record below which many detail records are mentioned. The number of details records can vary from one observation to another. Below is an illustration of a hierarchical file.

In the below file the details of each employee under each department is listed. The first record is the header record mentioning the department and the next record few records starting with DTLS are the details record.

DEPT:IT 
DTLS:1:Rick:623 
DTLS:3:Mike:611 
DTLS:6:Tusar:578 
DEPT:OPS
DTLS:7:Pranab:632
DTLS:2:Dan:452
DEPT:HR
DTLS:4:Ryan:487
DTLS:2:Siyona:452

Example

To read the hierarchical file we use the below code in which we identify the header record with an IF clause and use a do loop to process the details record.

data employees(drop = Type);
   length Type $ 3  Department
      empID $ 3 empName $ 10 Empsal 3 ;
   retain Department;
   infile 
   '/folders/myfolders/TutorialsPoint/empdtls.txt' dlm = ':';
   input Type $ @;
   if Type = 'DEP' then 
      input Department $;
   else do;
      input empID  empName $ Empsal ;
      output;
   end;
run;

   PROC PRINT DATA = employees;
RUN;

When the above code is executed, we get the following output.

read_heirarchial_data2
Advertisements