Subsetting SPSS data imported into R with package haven?
I've used the package haven to read SPSS data into R. It seems OK, except that when I try to subset the data it doesn't seem to behave correctly. Here's my code (I don't have SPSS to create example data, and I can't post the real stuff):
    require(haven)
    df <- read_spss("filename1.sav")
    tmp <- df[as_factor(df$variable1) == "factor1",]
    tmp <- tmp[!is.na(tmp$variable2), ]
The above df has NA values scattered throughout. I expected the code above to subset the data, keeping the rows where variable1 is "factor1" and discarding the rows with NAs in variable2. The first subset works as expected. The second subset does not: it removes rows, but NAs are still present.
I suspect the issue has to do with the way haven structures the imported data and uses the class labelled instead of an actual factor variable, but it's over my head. Does anyone know what is happening and how to accomplish the same thing?
Here's the structure of df, variable1, and variable2:
    > str(df)
    'data.frame': 4573 obs. of 316 variables:
    > str(df$variable1)
    Class 'labelled' atomic [1:4573] 9 9 9 14 8 8 2 4 8 16 ...
      ..- attr(*, "labels")= Named num [1:18] 1 2 3 4 5 6 7 8 9 10 ...
      .. ..- attr(*, "names")= chr [1:18] "factor1" "factor2" "factor3" "factor4" ...
    > str(df$variable2)
    Class 'labelled' atomic [1:4573] 3 NA 3 NA 3 NA 1 1 NA NA ...
      ..- attr(*, "labels")= Named num [1:3] 1 2 3
      .. ..- attr(*, "names")= chr [1:3] "sponsor" "not sponsor" "don't know"
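Since the .sav file can't be shared, here is a minimal sketch of the intended two-step subset on a toy data frame in base R. The values and the extra column `other` are invented for illustration. Plain factor and numeric columns stand in for haven's labelled vectors, so this shows only the general subsetting behavior, not a confirmed diagnosis. It illustrates two things that may be relevant: a logical index containing NA produces all-NA rows unless wrapped in `which()`, and filtering on variable2 leaves NAs in other columns untouched.

```r
# Toy stand-in for the imported data. Column names mirror the question;
# the values and the `other` column are hypothetical.
df <- data.frame(
  variable1 = factor(c("factor1", "factor2", NA, "factor1", "factor1")),
  variable2 = c(3, 1, 2, NA, 1),
  other     = c(NA, 5, 6, 7, 8)
)

# Step 1: keep rows where variable1 is "factor1".
# The comparison yields NA where variable1 is NA; which() drops those
# positions instead of returning an all-NA row for each of them.
keep1 <- which(df$variable1 == "factor1")
tmp <- df[keep1, ]

# Step 2: discard rows where variable2 is NA.
tmp <- tmp[!is.na(tmp$variable2), ]

print(tmp)
any(is.na(tmp$variable2))  # FALSE: variable2 is now NA-free
any(is.na(tmp))            # TRUE: NAs in `other` are not affected
```

With the real data, the first condition would presumably still go through `as_factor()` as in the original code. If the goal is to drop rows with an NA in any column, `complete.cases(tmp)` is the base R tool for that.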