string - Split rows according to text in two columns (Python, Pandas) -
this dataframe (with many more letters , length of ~35.5k) , stuff – other relevant strings). variables strings , ['c1','c2'] multiindex.
tmp c1 c2 c3 c4 c5 start end c8 1 - - - 12 14 - 2 - - - 1,4,7 3,6,10 - 3 - - - 16,19 17,21 - 4 - - - 22 24 -
i need become (split every row contains commas maintaining else):
c1 c2 c3 c4 c5 start end c8 appearance 1 - - - 12 14 - 1 2 - - - 1 3 - 1 2 - - - 4 6 - 2 2 - - - 7 10 - 3 3 - - - 16 17 - 1 3 - - - 19 21 - 2 4 - - - 22 24 - 1
i tried script pandas: how split text in column multiple rows?
as
s = tmp['start'].str.split(',').apply(series, 1).stack() s.index = s.index.droplevel(-1) s.name = 'start del tmp['start'] final = tmp.join(s)
but result larger should! thousands of repeats , trying split 'start'. can't imagine trying both start , end (every comma in 'start' implies comma in 'end'.
lengths: tmp = 35568 s = 35676 final = 293408
you can create new df
s1
, s2
, join
. better use parameter expand=true
in str.split
, delete multiple columns drop
:
for creating column appearance
use groupby
index
cumcount
.
s1 = tmp['start'].str.split(',', expand=true).stack() s1.index = s1.index.droplevel(-1) s1.name = 'start' s2 = tmp['end'].str.split(',', expand=true).stack() s2.index = s2.index.droplevel(-1) s2.name = 'end' tmp.drop(['start', 'end'], inplace=true, axis=1) df = pd.dataframe({'s1':s1, 's2':s2}, index=s1.index) final = tmp.join(df) final['appearance'] = final.groupby(final.index).cumcount() + 1 print (final) c1 c2 c3 c4 c5 c8 s1 s2 appearance 0 1 - - - - 12 14 1 1 2 - - - - 1 3 1 1 2 - - - - 4 6 2 1 2 - - - - 7 10 3 2 3 - - - - 16 17 1 2 3 - - - - 19 21 2 3 4 - - - - 22 24 1
edit comment:
you can try reset_index
first:
print (tmp) c3 c4 c5 start end c8 c1 c2 1 - - - 12 14 - 2 - - - 1,4,7 3,6,10 - 3 - - - 16,19 17,21 - 4 - - - 22 24 - tmp.reset_index(inplace=true) print (tmp) c1 c2 c3 c4 c5 start end c8 0 1 - - - 12 14 - 1 2 - - - 1,4,7 3,6,10 - 2 3 - - - 16,19 17,21 - 3 4 - - - 22 24 -
Comments
Post a Comment