python - Pandas: expand list in column to distinct rows
I have a dataset in which a large number of columns contain several values each (imported from Google Forms, from columns allowing multiple selection). I've imported them as lists initially.
Now I want to analyse the data based on the values in these columns, i.e. given
    df = pd.DataFrame(dict(a=[(1,2),(2,3),(1,)], b=[(1,3),(2,5),], c=['a','b','c']))

            a       b  c
    0  (1, 2)  (1, 3)  a
    1  (2, 3)  (2, 5)  b
    2     (1)      ()  c
I want to plot a bar chart where x is the distinct values in columns a and b (they share the same set of options), and y is the total count of rows having that option.
You can do this by summing the columns (which basically concatenates their contents) and calling pd.value_counts on the result. For example (modifying the DataFrame definition so that it does not raise an error):
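    import pandas as pd

    # b now has three entries (the last one an empty tuple), matching a and c
    df = pd.DataFrame(dict(a=[(1,2),(2,3),(1,)], b=[(1,3),(2,5),()], c=['a','b','c']))
    # summing a column of tuples concatenates them into one long tuple
    counts = pd.DataFrame({col: pd.value_counts(df[col].sum()) for col in ['a', 'b']})
    counts.plot(kind='bar')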
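The same per-option counts can also be obtained by expanding each tuple into distinct rows, which is what the title asks about. The sketch below is a variation on the answer, not the answer's own code: it assumes a pandas version that provides Series.explode (0.25 or later), and it uses the Series.value_counts method because the top-level pd.value_counts function is deprecated in recent pandas releases.

```python
import pandas as pd

df = pd.DataFrame(dict(a=[(1, 2), (2, 3), (1,)],
                       b=[(1, 3), (2, 5), ()],
                       c=['a', 'b', 'c']))

# explode() gives every tuple element its own row; an empty tuple becomes NaN,
# which value_counts() ignores by default.
counts = pd.DataFrame({col: df[col].explode().value_counts() for col in ['a', 'b']})

# Options missing from one column show up as NaN after alignment; treat them as 0.
counts = counts.fillna(0).astype(int).sort_index()

# counts.plot(kind='bar')  # needs matplotlib; draws one group of bars per option
```

Here counts has one row per distinct option and one column per source column, so the plot call in the comment should produce the chart described in the question.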
(Previous answer, for the original version of the question):
You can get the rows with a 2 in a using map, e.g.
    >>> df = pd.DataFrame(dict(a=[[1,2],[2,3],[1,3]], b=['a','b','c']))
    >>> df
            a  b
    0  [1, 2]  a
    1  [2, 3]  b
    2  [1, 3]  c
    >>> df[df.a.map(lambda l: 2 in l)]
            a  b
    0  [1, 2]  a
    1  [2, 3]  b
You could accomplish something similar using groupby followed by filter, though you first have to convert the values in a to tuples so that they're hashable (and can be used as group keys):
    >>> df.groupby(df.a.map(tuple)).filter(lambda group: 2 in group.name)
            a  b
    0  [1, 2]  a
    1  [2, 3]  b
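To compare the two selections side by side, the following sketch runs both on the same frame and checks that they pick the same rows. The names by_map, by_groupby and same_rows are made up for illustration and do not appear in the answer.

```python
import pandas as pd

df = pd.DataFrame(dict(a=[[1, 2], [2, 3], [1, 3]], b=['a', 'b', 'c']))

# Boolean-mask approach: keep rows whose list in column a contains 2.
by_map = df[df['a'].map(lambda l: 2 in l)]

# groupby/filter approach: group on the tuple form of a, keep groups whose key contains 2.
by_groupby = df.groupby(df['a'].map(tuple)).filter(lambda group: 2 in group.name)

# Both selections should refer to the same row labels.
same_rows = by_map.index.tolist() == by_groupby.index.tolist()
print(same_rows)
```

The map version is usually the simpler choice here; the groupby version mainly pays off if further per-group work is planned.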
Once you have either of these results, you can use, e.g., result['a'] = 2 to replace the values in the a column.
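As a minimal sketch of that last step, the code below stores the filtered rows in a variable called result (the name used above) and then overwrites column a. Taking an explicit copy and the .loc variant on the full frame are additions made here, not part of the answer; updated and mask are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(dict(a=[[1, 2], [2, 3], [1, 3]], b=['a', 'b', 'c']))

# True for rows whose list in column a contains 2.
mask = df['a'].map(lambda l: 2 in l)

# Work on an explicit copy so the assignment does not touch (or warn about) df.
result = df[mask].copy()
result['a'] = 2

# Alternative: replace the matching rows in a copy of the full frame,
# leaving the non-matching rows as they were.
updated = df.copy()
updated.loc[mask, 'a'] = 2
```

The first form keeps only the matching rows; the second keeps every row and changes only the matching ones.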