How to Read a SAS Dataset Into R – The Right Way

The problem: while packages for reading SAS datasets into R exist, they do not handle many formats, especially custom formats. Hence, a user must enter those manually in R. This becomes particularly onerous with survey datasets involving custom Likert scales.

Solution: the sas-r script. This handy script from an anonymous contributor generates R code that sets the levels, labels, and formatting of each variable.

Getting the Dataset into R

Two main options:

Haven package

read_sas(): reads .sas7bdat and .sas7bcat files from SAS
read_sav(): reads .sav files from SPSS
read_dta(): reads .dta files from Stata.

sas7bdat package

read.sas7bdat("psu97ai.sas7bdat")
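
As a rough sketch of the two options above, the helper below tries haven first and falls back to sas7bdat. The name read_sas_any is hypothetical and not part of either package. The file name is the one used above. The optional catalog argument assumes you have a matching .sas7bcat file, which haven can use to attach value labels.

```r
# Hypothetical helper: read a .sas7bdat file with whichever package is installed.
read_sas_any <- function(path, catalog = NULL) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  if (requireNamespace("haven", quietly = TRUE)) {
    # haven returns a tibble; convert to a plain data frame for consistency
    as.data.frame(haven::read_sas(path, catalog_file = catalog))
  } else if (requireNamespace("sas7bdat", quietly = TRUE)) {
    # sas7bdat does not read catalog files, so `catalog` is ignored here
    sas7bdat::read.sas7bdat(path)
  } else {
    stop("Install either the 'haven' or the 'sas7bdat' package")
  }
}

# Only attempt the read if the example file is actually present
if (file.exists("psu97ai.sas7bdat")) {
  psu <- read_sas_any("psu97ai.sas7bdat")
  str(psu)
}
```

Either way, the coded variables typically arrive as plain numbers, which is the gap the sas-r script is meant to fill.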

Solution: sas-r
A Simple SAS Program

  1. Edit three items:
    Source dataset location (reads in the column names and assigned formats)
    Formats, supplied in one of three ways:
      From a format library
      From a SAS program
      Or pasted directly into the sas-r script
    Output R program name
  2. Click Run in SAS
  3. Paste the generated R code into your program and lean back in your chair
Note: we had to add the option (notsorted), so that R does not sort the formats alphabetically.
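
To see why level order matters once the data is in R, here is a small base-R illustration with made-up responses. It shows only the symptom (alphabetical levels scramble a Likert scale) and does not reproduce what the sas-r script does internally.

```r
# Made-up responses on the six-point scale used in the example below
responses <- c("Rarely", "Always", "Never", "Often", "Usually", "Sometimes")

# Without explicit levels, factor() sorts the labels alphabetically
alphabetical <- factor(responses)
levels(alphabetical)

# Supplying the levels explicitly keeps the intended scale order
scale_order <- c("Always", "Usually", "Often", "Sometimes", "Rarely", "Never")
in_scale_order <- factor(responses, levels = scale_order)
levels(in_scale_order)
```

With alphabetical levels, "Never" lands right after "Always", so tables and plots built from the factor no longer follow the scale.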

Example

SAS: value am_terrified_about_being_o_ 1='Always' 2='Usually' 3='Often' 4='Sometimes' 5='Rarely' 6='Never';
R: EAT26$am_terrified_about_being_o <- factor(EAT26$am_terrified_about_being_o, c(1, 2, 3, 4, 5, 6), exclude = "")
levels(EAT26$am_terrified_about_being_o) <- c("Always", "Usually", "Often", "Sometimes", "Rarely", "Never")
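
The generated code can be tried on a toy data frame to see what it does. The sketch below uses made-up values (the real EAT26 data is not shown here) and also compares the result with a single factor() call that takes levels and labels together, which should give the same mapping.

```r
# Toy stand-in for the EAT26 dataset: numeric codes as they arrive from SAS
EAT26 <- data.frame(am_terrified_about_being_o = c(1, 6, 3, NA, 2))

codes  <- c(1, 2, 3, 4, 5, 6)
labels <- c("Always", "Usually", "Often", "Sometimes", "Rarely", "Never")

# Two-step form, as in the generated code: fix the levels, then rename them
EAT26$am_terrified_about_being_o <- factor(EAT26$am_terrified_about_being_o, codes, exclude = "")
levels(EAT26$am_terrified_about_being_o) <- labels

# One-step alternative (not what the script emits): levels and labels together
one_step <- factor(c(1, 6, 3, NA, 2), levels = codes, labels = labels)

# Both forms map code 1 to "Always", code 6 to "Never", and keep NA as NA
print(EAT26$am_terrified_about_being_o)
print(one_step)
```

Because the levels are listed explicitly, unused categories such as "Sometimes" stay in the factor even when no respondent chose them.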

Final Tips and Link
Add the notsorted option (otherwise factors will be sorted alphanumerically)

For ordinal factors, you must manually apply this to every variable:


ordered_vars <- 34:63  # column positions of the ordinal variables
EAT26[ordered_vars] <- lapply(EAT26[ordered_vars], as.ordered)
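
A self-contained version of the same idea, on a toy data frame with made-up column names, is sketched below. It selects the columns by name instead of position, which is an assumption about what suits your data rather than part of the original tip.

```r
scale_order <- c("Always", "Usually", "Often", "Sometimes", "Rarely", "Never")

# Toy stand-in for EAT26 with two Likert items (column names are made up)
EAT26 <- data.frame(
  id     = 1:4,
  item_a = factor(c("Always", "Never", "Often", "Rarely"), levels = scale_order),
  item_b = factor(c("Usually", "Usually", "Sometimes", "Never"), levels = scale_order)
)

# Convert only the Likert items; as.ordered() keeps the existing level order
ordered_vars <- c("item_a", "item_b")
EAT26[ordered_vars] <- lapply(EAT26[ordered_vars], as.ordered)

# Ordered factors support comparisons along the scale
is.ordered(EAT26$item_a)
EAT26$item_a < "Often"   # TRUE only for responses earlier in the scale than "Often"
```

Selecting by name keeps the conversion correct if columns are later added or reordered, whereas a position range such as 34:63 has to be rechecked by hand.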

Source: https://github.com/clindocu/sas-r