
What SASdecoder Can Do

This page explains what SASdecoder can do. It is excerpted from the User Manual.

Uses and Capabilities of SASdecoder

SASdecoder is a utility that can translate certain forms of SAS code into Stata dictionaries, Stata do-files, and Stat/Transfer schema files. The scope of SAS statements it can accept includes limited features of INFILE, INPUT, LABEL, PROC FORMAT and several other related statements that are used in specifying how raw data files are to be read.


SASdecoder runs on PCs under Windows; it has been tested on Windows XP, Windows Vista, and Windows 7. Its Stata do-file and dictionary output can be used on any system that has Stata installed, and its Stat/Transfer schema output can be used on any system that has Stat/Transfer installed. Thus, after a successful run of SASdecoder, the output can be copied to and used on a broader variety of machines than can run SASdecoder itself (e.g., Macintosh and Unix-based systems).


SASdecoder was created in response to a situation that is common among some data analysts: You are given a raw data (text) file, along with SAS code (a “SAS System Program”) to read it into the SAS internal form, but you do not use SAS. It can provide a significant advantage when used on very large SAS source files, ones that are too large to translate by hand.

It is important to understand that SASdecoder does not read the data; it gives you the tools to read the data using Stata or Stat/Transfer. Furthermore, SASdecoder does not translate other SAS data-management operations such as MERGE, nor does it translate analysis procedures such as FREQ, UNIVARIATE or TABLE. (It does, however, translate VALUE statements to value label definitions.) Finally, it does not convert SAS data; users who need conversion of data should use a conversion facility such as Stat/Transfer (see www.circlesys.com).

It is important, also, to understand that, due to these limitations, SASdecoder cannot accept most SAS programs as given. You will need to edit the program down to the essential parts that are relevant for data-reading. See More on What SASdecoder Can Do for more on this topic.

Stata and Stat/Transfer Output

For Stata users, SASdecoder can generate a Stata dictionary and, optionally, a corresponding do-file. You must have Stata software to be able to make use of these files. See www.stata.com for more information.

For users of other data formats, SASdecoder can generate a Stat/Transfer schema file, which can be used to convert the raw data into a multitude of formats (though the range of SAS input is somewhat restricted, as will be explained below). To use a Stat/Transfer schema file, you must have Stat/Transfer software, available from Circle Systems (www.circlesys.com), and your targeted data format must be among those generated by Stat/Transfer. Presently (in Version 10), Stat/Transfer supports the data formats listed below; with a valid schema file, the data can be read into any of them:

· 1-2-3

· Access (Windows version only)

· ASCII - Delimited

· ASCII - Fixed Format

· dBASE and compatible formats

· Epi Info

· Excel

· FoxPro

· Gauss

· HTML Tables (write-only)

· JMP

· LIMDEP

· Matlab

· Mineset

· Minitab

· NLOGIT

· ODBC (Windows and Mac versions only)

· OSIRIS (read-only)

· Paradox

· Quattro Pro

· R

· SAS Data Files

· SAS Value Labels

· SAS CPORT (read-only)

· SAS Transport Files

· S-PLUS

· SPSS Data Files

· SPSS Portable

· Stata

· Statistica (Windows version only)

· SYSTAT

· Triple-S

See www.circlesys.com for more information.

A Synopsis of What SASdecoder Can Do

The SAS INPUT statement allows these forms of input:

· column input

· formatted input

· list input

· named input

SASdecoder can handle all but the named input form.

Examples:


1. Column input[1]:


DATA;

INFILE "faminc01.dat";


INPUT

ID01 1-4 FIPS_STN 5-6

FAMINC01 9-15 TXHW01 16-22 TRHW01 23-29

TXOFM01 30-35;


LABEL

ID01="2001 INTERVIEW NUMBER"

FIPS_STN="FIPS STATE NUMERIC CODE"

FAMINC01="TOTAL FAMILY INCOME 2000"

TXHW01="TAXABLE INCOME HEAD AND WIFE 2000"

TRHW01="TRANSFER INCOME OF HEAD AND WIFE 2000"

TXOFM01="TAXABLE INCOME OTHER FAMILY MEMBERS";


This would be translated into this Stata Dictionary:


dictionary using faminc01.dat {

_column( 1) int id01 %4s "2001 INTERVIEW NUMBER"

_column( 5) byte fips_stn %2s "FIPS STATE NUMERIC CODE"

_column( 9) long faminc01 %7s "TOTAL FAMILY INCOME 2000"

_column( 16) long txhw01 %7s "TAXABLE INCOME HEAD AND WIFE 2000"

_column( 23) long trhw01 %7s "TRANSFER INCOME OF HEAD AND WIFE 2000"

_column( 30) long txofm01 %6s "TAXABLE INCOME OTHER FAMILY MEMBERS"

}


…or to this Stat/Transfer schema file:


file faminc01.dat


variables

id01 1-4 {2001 INTERVIEW NUMBER}

fips_stn 5-6 {FIPS STATE NUMERIC CODE}

faminc01 9-15 {TOTAL FAMILY INCOME 2000}

txhw01 16-22 {TAXABLE INCOME HEAD AND WIFE 2000}

trhw01 23-29 {TRANSFER INCOME OF HEAD AND WIFE 2000}

txofm01 30-35 {TAXABLE INCOME OTHER FAMILY MEMBERS}
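
The arithmetic behind the column-input translation above can be sketched in a few lines of Python. This is only an illustration, not SASdecoder's actual implementation: the function names are hypothetical, and the rule for choosing a storage type (the smallest Stata integer type that can hold any value of the given width) is an assumption inferred from the example output. It applies to numeric variables only and ignores signs and decimals.

```python
# Largest values that Stata's integer storage types can hold.
STATA_INT_MAX = [("byte", 100), ("int", 32740), ("long", 2147483620)]


def smallest_int_type(width):
    """Assumed rule: pick the smallest Stata integer type that can hold
    any unsigned value of `width` digits; fall back to double otherwise."""
    largest = 10 ** width - 1
    for type_name, type_max in STATA_INT_MAX:
        if largest <= type_max:
            return type_name
    return "double"


def column_to_dictionary_line(name, start, end, label):
    """Turn a SAS column-input range (e.g. FAMINC01 9-15) into one
    Stata dictionary line. Column ranges are inclusive at both ends."""
    width = end - start + 1
    return (f'_column({start}) {smallest_int_type(width)} '
            f'{name.lower()} %{width}s "{label}"')


line = column_to_dictionary_line("FAMINC01", 9, 15, "TOTAL FAMILY INCOME 2000")
print(line)
```

The width is the only quantity that has to be computed; the starting column and the label carry over unchanged, and the variable name is lowercased as in the example output.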

2. Formatted input[2]:


INPUT

@1 CASEID $15.

@18 V000 $3.

@21 V001 8.0

@29 V002 4.0

@33 V003 3.0

;


This would get translated into these Stata Dictionary elements:


_column( 1) str15 caseid %15s

_column( 18) str3 v000 %3s

_column( 21) long v001 %8f

_column( 29) int v002 %4f

_column( 33) int v003 %3f


…or into these Stat/Transfer schema elements:


Variables

caseid 1-15 (A)

v000 18-20 (A)

v001 21-28

v002 29-32

v003 33-35
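
For formatted input, the column range in the schema follows from the column pointer and the informat width. The Python sketch below shows that calculation for items of the simple form used in this example (a pointer, a name, and a width with an optional leading $). It is a hypothetical illustration, not SASdecoder's parser, and it ignores every other informat that SAS allows.

```python
import re

# One formatted-input item: "@<column> <name> [$]<width>.[<decimals>]"
ITEM = re.compile(r"@(\d+)\s+(\w+)\s+(\$?)(\d+)\.(\d*)")


def formatted_to_schema_line(item):
    """Turn an item such as '@18 V000 $3.' into a schema variable line."""
    match = ITEM.fullmatch(item.strip())
    if match is None:
        raise ValueError(f"not a recognized formatted-input item: {item!r}")
    start, name, dollar, width = match.group(1, 2, 3, 4)
    start, width = int(start), int(width)
    end = start + width - 1              # last column occupied by the field
    suffix = " (A)" if dollar else ""    # $ marks a character variable
    return f"{name.lower()} {start}-{end}{suffix}"


for item in ["@1 CASEID $15.", "@18 V000 $3.", "@21 V001 8.0"]:
    print(formatted_to_schema_line(item))
```

The end column is the start column plus the width minus one, which is why @18 with a width of 3 becomes 18-20.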

3. List input:


INPUT

name $ age earnings;


This gets translated into these Stata Dictionary elements:


str8 name %s

float age %f

float earnings %f


…or to these Stat/Transfer elements:


name (A8)

age (F)

earnings (F)


In a typical SAS program that reads raw data, only one input form is used. It is possible to mix the forms, in which case the output, whether a Stata dictionary or a Stat/Transfer schema, will be created with a corresponding mixture of specification types. In the case of a Stat/Transfer schema, however, a mixture that includes list input will not be valid: a Stat/Transfer schema may contain list-input variables only if list input is the only input type used, i.e., all variables are list-input. As noted, most real-world examples use only one input form, so this is not expected to be a serious limitation.
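
The rule about list input in a Stat/Transfer schema can be stated as a small check. The Python sketch below is only an illustration of that rule; the function name and the form labels ("column", "formatted", "list") are hypothetical and are not part of SASdecoder or Stat/Transfer.

```python
def schema_input_forms_ok(forms):
    """Return True if a Stat/Transfer schema built from these per-variable
    input forms would satisfy the list-input rule: list input is allowed
    only when every variable uses it."""
    uses_list = "list" in forms
    return (not uses_list) or all(form == "list" for form in forms)


print(schema_input_forms_ok(["list", "list", "list"]))
print(schema_input_forms_ok(["column", "formatted"]))
print(schema_input_forms_ok(["column", "list"]))
```

Only the last call describes a schema that breaks the rule, because it combines list input with another form.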


In typical use, the SAS code is a given entity – already written and tested by SAS programmers. It is not intended for you to write SAS code for SASdecoder to translate.[3]

More on What SASdecoder Can Do


It is important to understand that SASdecoder accepts only a limited set of features of a small subset of SAS statements. Furthermore, there are limits to its capability to produce Stata or Stat/Transfer files that emulate the behavior of SAS. In part, these limitations reflect how far Stata and Stat/Transfer themselves can emulate SAS features for reading raw data.


SASdecoder is designed to accept those SAS features that specify the reading of raw data files, and only a certain subset of these features can be translated to either a Stata dictionary or do-file or a Stat/Transfer schema. Furthermore, not every translatable SAS feature has been accommodated to date, but much effort has been put into accommodating most of what you are likely to encounter.


Usually, a SAS program that was written for reading raw data cannot be used as-is by SASdecoder, but it still may be usable. You may need to edit out or comment out certain parts of the file: features that SASdecoder doesn't understand, but which are not essential to the task of translation to Stata or Stat/Transfer. You may also need to rename some identifiers, such as those that are illegal as Stata variable names. It may take several iterations before this succeeds; you may want to use the “Dryrun” button or dryrun command (or omit the output file specifications) until you achieve a successful parsing of the SAS code. Also, you may want to do your editing in a separate copy of the SAS code file (saved under a distinct filename), keeping the original unchanged for safekeeping and reference.
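
One way to find identifiers that need renaming before you run SASdecoder is to screen the SAS variable names yourself. The Python sketch below is a hypothetical helper, not a SASdecoder feature. It assumes names are lowercased on output (as in the examples on this page) and uses a list of Stata reserved words that you should verify against the documentation for your Stata version.

```python
import re

# Assumed list of Stata reserved words (lowercased); verify against
# the documentation for your Stata version.
STATA_RESERVED = {
    "_all", "_b", "byte", "_coef", "_cons", "double", "float", "if",
    "in", "int", "long", "_n", "_pi", "_pred", "_rc", "_skip",
    "using", "with",
}


def is_legal_stata_name(name):
    """Check a SAS identifier after lowercasing: 1 to 32 letters, digits
    or underscores, not starting with a digit, and not reserved."""
    candidate = name.lower()
    if not re.fullmatch(r"[a-z_][a-z0-9_]{0,31}", candidate):
        return False
    if candidate in STATA_RESERVED:
        return False
    # str1, str2, ... are storage type names and are also reserved.
    return re.fullmatch(r"str\d+", candidate) is None


def names_needing_rename(names):
    """Return the identifiers that should be renamed in the SAS code."""
    return [name for name in names if not is_legal_stata_name(name)]


print(names_needing_rename(["ID01", "IF", "STR8", "FAMINC01", "_N"]))
```

Renaming the flagged identifiers in your working copy of the SAS code, in both the INPUT and LABEL statements, keeps the names consistent in the translated output.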


Also, the output of SASdecoder may not always be precisely what you need, and you may want to adjust it as you see fit before using it. And it may need some tweaking to run satisfactorily. So you should consider it a starting point rather than a completed product.


Some SAS features cannot always be translated exactly into Stata or Stat/Transfer; some translate to code that is as close as possible, but may not produce exactly the same results. Sometimes this depends on the form of the raw data involved. Usually, warnings will be issued in these situations.

[1] This example is excerpted from the Panel Study of Income Dynamics Family Income-Plus supplement files: psidonline.isr.umich.edu

[2] Excerpted from the zwir31rt.sas file from Demographic and Health Surveys; www.measuredhs.com

[3] While that is not the intent, it is certainly possible to do. This might be instructive to someone with SAS skills to learn about Stata dictionaries or Stat/Transfer schemas.
