r - Can't drop column - select() with dplyr
I'm using dplyr and have a grouped data.frame. I tried to drop a column with the select
function on a grouped_df, but got an error message:
> tbl %>% select(-names)
Error: corrupt 'grouped_df', contains 42 rows, and 965 rows in groups
My data is below.
> print(tbl_df(tbl), n = 1000)
Source: local data frame [42 x 15]

   household                    names                                        x2003  x2004  x2005  x2006  x2007  x2008  x2009  x2012 last.avail last.avail.year abschange.last annchange.last  translation
   (chr)                        (fctr)                                       (int)  (int)  (int)  (int)  (int)  (int)  (int)  (int)      (int)           (dbl)          (int)          (dbl)  (fctr)
 1 households                   bostad                                       59280  61850  62760  63210  66950  73340  72350  77750      77750            2012          18470    0.030594980  accomodation
 2 households                   fritid och kultur                            45140  46140  49260  48640  49720  55120  53970  61170      61170            2012          16030    0.034341864  leisure and culture
 3 households                   transport                                    41930  40430  45870  48850  47280  50250  42650  49940      49940            2012           8010    0.019614408  transportation
 4 households                   köpta livsmedel                              28420  30000  29130  30420  30750  34130  34780  34570      34570            2012           6150    0.022004509  bought groceries
 5 households                   hyra/avgift för hyres-/borätt (inkl garage)  27310  27720  28860  30000  28990  29660  30740     NA      30740            2009           3430    0.019914330  rent accomodation
 6 households                   hushållstjänster                             11360  12030  13200  12390   8520  10250  13530  22900      22900            2012          11540    0.081007165  household services
 7 cohabit child                bostad                                       78240  83040  81390  79180  90490  95630 100060 100980     100980            2012          22740    0.028754709  accomodation
 8 cohabit child                fritid och kultur                            67110  67640  67290  64600  74290  71890  77200  81180      81180            2012          14070    0.021373640  leisure and culture
 9 cohabit child                transport                                    58350  62440  70010  69560  68730  75290  65510  71340      71340            2012          12990    0.022584342  transportation
10 cohabit child                köpta livsmedel                              45190  45660  45720  44980  48250  52880  52770  52710      52710            2012           7520    0.017250361  bought groceries
11 cohabit child                hushållstjänster                             19840  21380  25690  21430  17190  19060  24730  37440      37440            2012          17600    0.073108900  household services
12 cohabit child                räntor (brutto)                              27090  25230  24390  24500  28510  36030  33080     NA      33080            2009           5990    0.033854485  rents (net)
13 cohabit without child        bostad                                       60340  63230  63560  61760  67100  74160  70440  78510      78510            2012          18170    0.029679783  accomodation
14 cohabit without child        fritid och kultur                            51120  48780  57700  57320  57620  67220  62460  68400      68400            2012          17280    0.032884345  leisure and culture
15 cohabit without child        transport                                    49740  46310  55580  57730  56770  54910  52720  59360      59360            2012           9620    0.019839931  transportation
16 cohabit without child        köpta livsmedel                              31130  33700  31900  33000  33990  37330  37980  37090      37090            2012           5960    0.019654591  bought groceries
17 cohabit without child        drift av bil                                 24370  21790  25170  27530  25140  28180  26650     NA      26650            2009           2280    0.015017696  car expenses
18 cohabit without child        hushållstjänster                             11650  12400  12260  12310   8580  11920  13950  26370      26370            2012          14720    0.095016005  household services
19 other cohabit child          fritid och kultur                            67680  75550  78020  75800  88870  80070  84490 116020     116020            2012          48340    0.061715253  leisure and culture
20 other cohabit child          bostad                                       73850  68740  84800  86510  89290 106540  89650 100580     100580            2012          26730    0.034920030  accomodation
21 other cohabit child          transport                                    66950  79620  75730  77800  81010  93790  77960  98660      98660            2012          31710    0.044022982  transportation
22 other cohabit child          köpta livsmedel                              54070  53790  50680  51440  53720  64170  62050  63690      63690            2012           9620    0.018360752  bought groceries
23 other cohabit child          drift av bil                                 32690  34180  37530  36200  38280  38990  36390     NA      36390            2009           3700    0.018031437  car expenses
24 other cohabit child          hushållstjänster                             15690  21000  20810  20370   9990  11880  19710  32460      32460            2012          16770    0.084128145  household services
25 other households             bostad                                       62860  68680  69950  72840  70700  91510  84480  86020      86020            2012          23160    0.035466655  accomodation
26 other households             fritid och kultur                            49940  48530  55280  57970  54470  61130  65280  67920      67920            2012          17980    0.034758001  leisure and culture
27 other households             transport                                    50590  41980  57370  64960  52780  61460  59770  59630      59630            2012           9040    0.018435074  transportation
28 other households             köpta livsmedel                              35370  35210  35360  41560  35040  43770  45940  43270      43270            2012           7900    0.022652258  bought groceries
29 other households             drift av bil                                 21440  21580  25640  30070  28260  30070  32010     NA      32010            2009          10570    0.069079862  car expenses
30 other households             hyra/avgift för hyres-/borätt (inkl garage)  29550  32320  25170  24600  29480  35290  25920     NA      25920            2009          -3630   -0.021607942  rent accomodation
31 single parent                bostad                                       67890  67250  71200  75210  71000  73490  74710  81820      81820            2012          13930    0.020953501  accomodation
32 single parent                fritid och kultur                            34900  35860  43600  46770  43540  46160  45840  51000      51000            2012          16100    0.043049627  leisure and culture
33 single parent                hyra/avgift för hyres-/borätt (inkl garage)  43360  44020  45160  49430  45370  44090  48740     NA      48740            2009           5380    0.019685026  rent accomodation
34 single parent                transport                                    27230  30810  28810  28410  30500  30390  29360  34890      34890            2012           7660    0.027925124  transportation
35 single parent                köpta livsmedel                              26420  27910  28160  29100  28310  33020  35910  33740      33740            2012           7320    0.027546212  bought groceries
36 single parent                hushållstjänster                              9490  11690  13770   8650   7250  10390  11490  17140      17140            2012           7650    0.067891620  household services
37 single parent without child  bostad                                       45660  47110  48750  50850  51610  55720  56020  61090      61090            2012          15430    0.032876143  accomodation
38 single parent without child  fritid och kultur                            28270  31890  31140  30210  28480  35650  32840  41770      41770            2012          13500    0.044329701  leisure and culture
39 single parent without child  hyra/avgift för hyres-/borätt (inkl garage)  31900  32160  33010  36300  34300  35330  37800     NA      37800            2009           5900    0.028687635  rent accomodation
40 single parent without child  transport                                    26730  22980  24530  29310  28440  31680  20150  28800      28800            2012           2070    0.008322088  transportation
41 single parent without child  köpta livsmedel                              15330  16930  16150  17630  17280  18390  19370  19580      19580            2012           4250    0.027561531  bought groceries
42 single parent without child  hushållstjänster                              6570   6590   6840   7080   3780   4300   7000  12310      12310            2012           5740    0.072257733  household services
What is the issue, and how can it be resolved?
If the variable to drop is also used as a grouping variable, we need to ungroup before using that variable in select. This is the case in the current dplyr version (dplyr_0.4.3); it may or may not change in future dplyr versions.
tbl %>% ungroup() %>% select(-names)
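The same pattern can be tried on a small self-contained data set. This is only a minimal sketch: the toy data and the column names `grp` and `value` are made up for illustration and are not taken from the question.

```r
library(dplyr)

# Hypothetical toy data; 'grp' and 'value' are illustrative names only.
toy <- data.frame(grp = c("a", "a", "b"), value = 1:3) %>%
  group_by(grp)

# Remove the grouping first, then drop the former grouping column.
res <- toy %>%
  ungroup() %>%
  select(-grp)

names(res)
```

After `ungroup()` the result is no longer a grouped_df, so `select()` has no grouping column that it needs to keep.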
As an example of corrupted grouped data, suppose we try to remove the column 'y' from 'dat3':
dat3 %>% select(-y)
#Error: corrupt 'grouped_df', contains 1100 rows, and 1000 rows in groups
By checking str(dat3)
str(dat3)
#Classes ‘grouped_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 1100 obs. of 2 variables:
# $ group: Factor w/ 3 levels "a","b","c": 2 3 2 2 2 2 1 2 2 1 ...
# $ y    : num 1.396 -0.892 1.065 0.801 -0.368 ...
# - attr(*, "vars")=List of 1
#  ..$ : symbol group
# - attr(*, "drop")= logi TRUE
# - attr(*, "indices")=List of 3
#  ..$ : int 6 9 12 13 14 16 18 21 25 27 ...
#  ..$ : int 0 2 3 4 5 7 8 10 11 15 ...
#  ..$ : int 1 17 24 28 35 37 39 43 47 49 ...
# - attr(*, "group_sizes")= int 323 365 312
# - attr(*, "biggest_group_size")= int 365
# - attr(*, "labels")='data.frame': 3 obs. of 1 variable:
#  ..$ group: Factor w/ 3 levels "a","b","c": 1 2 3
#  ..- attr(*, "vars")=List of 1
#  .. ..$ : symbol group
#  ..- attr(*, "drop")= logi TRUE
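we find that the grouping attributes present after rbinding are stale: the group sizes (323 + 365 + 312) still sum to 1000, although the data now has 1100 rows. If we use bind_rows instead,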
dat4 <- bind_rows(dat1, dat2)
str(dat4)
#Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 1100 obs. of 2 variables:
# $ group: chr "b" "c" "b" "b" ...
# $ y    : num 1.396 -0.892 1.065 0.801 -0.368 ...
we can remove the 'y' column from 'dat4'
dat4 %>% select(-y)
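The bind_rows route can be reproduced end to end with stand-in data. The definitions of dat1 and dat2 are not shown above, so the versions below are assumptions (random data with 1000 and 100 rows, chosen to match the 1100 rows in the str output), and `make_part` is a hypothetical helper.

```r
library(dplyr)
set.seed(42)

# Hypothetical helper: builds a data frame shaped like the ones in the answer.
make_part <- function(n) {
  data.frame(
    group = sample(c("a", "b", "c"), n, replace = TRUE),
    y = rnorm(n),
    stringsAsFactors = FALSE
  )
}

# Assumed stand-ins for dat1 and dat2, both grouped by 'group'.
dat1 <- make_part(1000) %>% group_by(group)
dat2 <- make_part(100) %>% group_by(group)

# Combine with bind_rows() and drop the non-grouping column 'y'.
dat4 <- bind_rows(dat1, dat2)
res <- dat4 %>% select(-y)

nrow(res)
names(res)
```

Whether the result of bind_rows() still carries groups may depend on the dplyr version; in either case 'y' is not a grouping column, so dropping it should work.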
As the OP didn't show how 'tbl' got created, we can only assume that it was created using one of the methods that corrupt the dataset by adding attributes.
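One way to look for this kind of mismatch is to compare the row count with the total of the group sizes, which is the comparison the error message reports. The helper below is a rough sketch: `groups_match_rows` is a hypothetical name, not a dplyr function, and it is only demonstrated here on a healthy object.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): TRUE when the group sizes
# reported by group_size() add up to the number of rows.
groups_match_rows <- function(x) {
  sum(group_size(x)) == nrow(x)
}

# Illustrative data with consistent grouping.
toy <- data.frame(group = c("a", "b", "a", "c"), y = c(1.5, 2, 3.5, 4)) %>%
  group_by(group)

groups_match_rows(toy)
```

If such a check fails on a real object, calling ungroup() and then group_by() again may rebuild consistent grouping information, though that is an assumption and not something the question confirms.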