python - Pandas: expand list in column to distinct rows


I have a dataset in which a large number of columns contain several values (imported from Google Forms, from columns allowing multiple selection). I've imported them as lists initially.

Now I want to analyse the data based on the values in those columns, i.e. given

    df = pd.DataFrame(dict(a=[(1,2),(2,3),(1,)], b=[(1,3),(2,5),], c=['a','b','c']))

            a       b  c
    0  (1, 2)  (1, 3)  a
    1  (2, 3)  (2, 5)  b
    2    (1,)      ()  c

I want to plot a bar chart where x is the distinct values in columns a and b (they share the same set of options), and y is the total count of rows having that option:

You can do this by summing the columns (basically concatenating their contents) and calling pd.value_counts on them. For example (modifying your DataFrame definition so that it does not raise an error):

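    import pandas as pd

    df = pd.DataFrame(dict(a=[(1,2),(2,3),(1,)],
                           b=[(1,3),(2,5),()],
                           c=['a','b','c']))
    # Summing a column of tuples concatenates them into one long tuple.
    # Newer pandas releases deprecate the top-level pd.value_counts;
    # pd.Series(...).value_counts() is the equivalent call.
    counts = pd.DataFrame({col: pd.value_counts(df[col].sum())
                           for col in ['a', 'b']})
    counts.plot(kind='bar')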
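As an alternative to summing the tuples, the per-option counts can also be obtained by expanding each tuple into its own rows. This is only a sketch and not part of the original answer; it assumes pandas 0.25 or later, where Series.explode is available, and it assumes the options are integers as in the example data.

```python
import pandas as pd

df = pd.DataFrame(dict(a=[(1, 2), (2, 3), (1,)],
                       b=[(1, 3), (2, 5), ()],
                       c=['a', 'b', 'c']))

# explode() gives each tuple element its own row; an empty tuple becomes NaN,
# so drop those rows before counting.
counts = pd.DataFrame({
    col: df[col].explode().dropna().astype(int).value_counts()
    for col in ['a', 'b']
})

# An option that never occurs in one column is NaN after alignment; treat it as 0.
counts = counts.fillna(0).astype(int).sort_index()
```

The resulting counts frame has one row per option and one column per source column, so counts.plot(kind='bar') should draw the same kind of chart as above.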



(Previous answer, to the original version of the question):

You can select the rows where 2 is in a using map, e.g.

    >>> df = pd.DataFrame(dict(a=[[1,2],[2,3],[1,3]], b=['a','b','c']))
    >>> df
            a  b
    0  [1, 2]  a
    1  [2, 3]  b
    2  [1, 3]  c

    >>> df[df.a.map(lambda l: 2 in l)]
            a  b
    0  [1, 2]  a
    1  [2, 3]  b

You could accomplish something similar using groupby followed by filter, though you first have to convert the a values to tuples so that they're hashable (and can be used as group keys):

    >>> df.groupby(df.a.map(tuple)).filter(lambda group: 2 in group.name)
            a  b
    0  [1, 2]  a
    1  [2, 3]  b

Once you have either of these results, you can use, e.g., result['a'] = 2 to replace the values in the a column.
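If the goal is to change the original frame rather than the filtered copy, one option is to keep the boolean mask and assign through .loc. This is a hedged sketch, not the answer's own method; the name mask is introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(dict(a=[[1, 2], [2, 3], [1, 3]], b=['a', 'b', 'c']))

# Boolean mask: True for rows whose list in column a contains 2.
mask = df['a'].map(lambda l: 2 in l)

# Assign through .loc on the original frame instead of on a filtered copy.
df.loc[mask, 'a'] = 2
```

Assigning on the original frame this way avoids writing to a filtered result, which pandas may flag with a SettingWithCopyWarning.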

