Data analysis - Shorter method to replace entries in R

- January 15, 2013

This question already has an answer here:

Change values in multiple columns of a dataframe using a lookup table (2 answers)

I have started learning R recently. Here's the source file I am working with (https://github.com/cosname/art-r-translation/blob/master/data/grades.txt). Is there any way I can change the letter grades, say A to 4.0, A- to 3.7, etc., without using a loop?

I am asking because if there are 1M entries, a "for" loop might not be an efficient way to modify the data. I would appreciate any help.

Since one of the posters told me to post my code, I thought of running a for loop to see whether I am able to do it. Here's my code:

    mygrades <- read.table("grades.txt", header = TRUE)

    for (i in 1:nrow(mygrades)) {
      # print(i)

      # For now, see whether "A" gets replaced by 4.0.
      if (mygrades[i, 1] == "A")
      {
        mygrades[i, 1] = 4.0
      }
      else if (mygrades[i, 2] == "A")
      {
        mygrades[i, 2] = 4.0
      }
      else if (mygrades[i, 3] == "A")
      {
        mygrades[i, 3] = 4.0
      }
      else
      {
        # do nothing...continues
      }
    }

    write.table(mygrades, "newgrades.txt")

However, the output is a little weird. For the "A"s, it shows NA, and the others are left as is. Can someone please help me with the code?

@alistaire, I did try Hadley's look-up table, and it works. I also looked at the dplyr code, and it works well. However, for the sake of my understanding, I'm still trying to use for loops. Please note that it has been about two days since I opened an R book. Here's the modified code.

    # There was one mistake in my code: I didn't use stringsAsFactors=FALSE.
    # Now, the code doesn't work for all "A"s. It spits out 4.0 for some As, and
    # doesn't for others. Why would that be?

    mygrades <- read.table("grades.txt", header = TRUE, stringsAsFactors = FALSE)

    for (i in 1:nrow(mygrades)) {
      # print(i)

      if (mygrades[i, 1] == "A")
      {
        mygrades[i, 1] = 4.0
      }
      else if (mygrades[i, 2] == "A")
      {
        mygrades[i, 2] = 4.0
      }
      else if (mygrades[i, 3] == "A")
      {
        mygrades[i, 3] = 4.0
      }
      else
      {
        # do nothing...continues
      }
    }

    write.table(mygrades, "newgrades.txt")

The output is:

    "final_exam" "quiz_avg" "homework_avg"
    "1" "C" "4" "A"
    "2" "C-" "B-" "4"
    "3" "D+" "B+" "4"
    "4" "B+" "B+" "4"
    "5" "F" "B+" "4"
    "6" "B" "A-" "4"
    "7" "D+" "B+" "A-"
    "8" "D" "A-" "4"
    "9" "F" "B+" "4"
    "10" "4" "C-" "B+"
    "11" "A+" "4" "A"
    "12" "A-" "4" "A"
    "13" "B" "4" "A"
    "14" "D-" "A-" "4"
    "15" "A+" "4" "A"
    "16" "B" "A-" "4"
    "17" "F" "D" "A-"
    "18" "B" "4" "A"
    "19" "B" "B+" "4"
    "20" "A+" "A-" "4"
    "21" "4" "A" "A"
    "22" "B" "B+" "4"
    "23" "D" "B+" "4"
    "24" "A-" "A-" "4"
    "25" "F" "4" "A"
    "26" "B+" "B+" "4"
    "27" "A-" "B+" "4"
    "28" "A+" "4" "A"
    "29" "4" "A-" "A"
    "30" "A+" "A-" "4"
    "31" "4" "B+" "A-"
    "32" "B+" "B+" "4"
    "33" "C" "4" "A"

As you can see in the first row, the first A got recoded to 4, but the second A didn't get recoded. Any idea why this is happening?

Thanks in advance.

A typical way to do this in base R is to make a named vector as a lookup table, e.g.

    # data with fewer levels for simplicity
    df <- data.frame(x = rep(1:3, 2), y = rep(1:2, 3))

    lookup <- c(`1` = "a", `2` = "b", `3` = "c")

and subset it with each column:

    data.frame(lapply(df, function(x){lookup[x]}))
    ##   x y
    ## 1 a a
    ## 2 b b
    ## 3 c a
    ## 4 a b
    ## 5 b a
    ## 6 c b
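Applied to the letter grades in the question, the same idea might look like the sketch below. The sample data frame is made up for illustration and is not the real grades.txt. Only A = 4.0 and A- = 3.7 come from the question; the remaining grade points are an assumed conventional scale. Indexing with as.character() matters if the columns are factors, because a factor would otherwise index the lookup by its integer codes rather than by its labels.

```r
# Hypothetical sample standing in for grades.txt (not the real data)
grades <- data.frame(
  final_exam   = c("C", "C-", "A"),
  quiz_avg     = c("A", "B-", "A-"),
  homework_avg = c("A", "A", "B+"),
  stringsAsFactors = FALSE
)

# Named vector as lookup table. Only A and A- are given in the question;
# the other grade points are an assumed scale.
lookup <- c(`A+` = 4.0, `A` = 4.0, `A-` = 3.7,
            `B+` = 3.3, `B` = 3.0, `B-` = 2.7,
            `C+` = 2.3, `C` = 2.0, `C-` = 1.7,
            `D+` = 1.3, `D` = 1.0, `D-` = 0.7, `F` = 0.0)

# Index the lookup by name for every column at once, without an explicit loop
points <- data.frame(lapply(grades, function(x) unname(lookup[as.character(x)])))
points
```

Any grade that is missing from the lookup vector comes back as NA, which makes gaps in the table easy to spot.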

Alternatively, dplyr added a recode function that's useful for such a job:

    library(dplyr)

    df <- read.table('https://raw.githubusercontent.com/cosname/art-r-translation/master/data/grades.txt', header = TRUE)

    # note: funs() and as_data_frame() are deprecated in newer dplyr/tibble versions
    df %>% mutate_all(funs(recode(., A = '4.0',
                                   `A-` = '3.7'))) %>%    # etc.
        as_data_frame()    # prettier printing

    ## # A tibble: 33 x 3
    ##    final_exam quiz_avg homework_avg
    ##        <fctr>   <fctr>       <fctr>
    ## 1           C      4.0          4.0
    ## 2          C-       B-          4.0
    ## 3          D+       B+          4.0
    ## 4          B+       B+          4.0
    ## 5           F       B+          4.0
    ## 6           B      3.7          4.0
    ## 7          D+       B+          3.7
    ## 8           D      3.7          4.0
    ## 9           F       B+          4.0
    ## 10        4.0       C-           B+
    ## # ... with 23 more rows
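As for the loop in the question: an if / else if chain stops at the first condition that is true, so at most one column per row gets recoded, which matches the output shown there. The NA values in the first attempt are consistent with what R does when a value that is not an existing level is assigned into a factor column. Below is a minimal sketch of a loop that tests every cell independently, followed by a vectorized equivalent. It uses a made-up three-row sample rather than the real file.

```r
# Hypothetical sample standing in for grades.txt (not the real data)
mygrades <- data.frame(
  final_exam   = c("C", "A", "A"),
  quiz_avg     = c("A", "B-", "A"),
  homework_avg = c("A", "A", "A-"),
  stringsAsFactors = FALSE
)

# Loop version: check each cell on its own instead of chaining else if
looped <- mygrades
for (i in seq_len(nrow(looped))) {
  for (j in seq_len(ncol(looped))) {
    if (looped[i, j] == "A") {
      looped[i, j] <- "4.0"
    }
  }
}

# Vectorized version: a logical matrix selects every matching cell at once
vectorized <- mygrades
vectorized[vectorized == "A"] <- "4.0"

# Both approaches should agree cell by cell
all(looped == vectorized)
```

The vectorized assignment handles a single grade per statement, so for a full grade scale the lookup-table approach above is likely to be shorter.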
