python - Query in pandas for closest object (in time) which meets a set of conditions
I am using pandas to manage a set of files which have several properties:
    import pandas as pd

    data = {'objtype': ['bias', 'bias', 'flat', 'flat', 'stdstar', 'flat', 'arc',
                        'target1', 'arc', 'flat', 'flat', 'flat', 'bias', 'bias'],
            'ut': pd.date_range("11:00", "12:05", freq="5min").values,
            'position': ['p0', 'p0', 'p0', 'p0', 'p1', 'p1', 'p1',
                         'p2', 'p2', 'p2', 'p0', 'p0', 'p0', 'p0']}
    df = pd.DataFrame(data=data)

which gives me a dataframe like this one:
        objtype position                  ut
    0      bias       p0 2016-07-15 11:00:00
    1      bias       p0 2016-07-15 11:05:00
    2      flat       p0 2016-07-15 11:10:00
    3      flat       p0 2016-07-15 11:15:00
    4   stdstar       p1 2016-07-15 11:20:00
    5      flat       p1 2016-07-15 11:25:00
    6       arc       p1 2016-07-15 11:30:00
    7   target1       p2 2016-07-15 11:35:00
    8       arc       p2 2016-07-15 11:40:00
    9      flat       p2 2016-07-15 11:45:00
    10     flat       p0 2016-07-15 11:50:00
    11     flat       p0 2016-07-15 11:55:00
    12     bias       p0 2016-07-15 12:00:00
    13     bias       p0 2016-07-15 12:05:00

I would like to index the objects which meet a temporal condition in addition to another one. For example:
I want the closest object to target1 with objtype 'arc'. This query would have 2 candidates: 6 and 8.
If, for example, I query for the closest object to target1 with objtype 'arc' that also shares the same position (p2), I would get 8.
I have been trying to slice the data frame according to the initial conditions and then using numpy, but I am making a non-pythonic mess.
Any advice?
Let's build a function:
    def get_closest(df, idx, bool_cond, to_this):
        others = df.loc[bool_cond, to_this]
        target = df.loc[idx, to_this]
        return df.loc[(others - target).abs().idxmin()]

First, I assume that when you are looking for something closest to something else, you have a unique index. If you don't, get one. In this case, your index is 7, corresponding to the value 'target1'. Next, build a boolean series representing the conditions you care about.
    cond1 = df.objtype == 'arc'
    cond2 = df.position == df.loc[7, 'position']

Then we can call our function like:
    get_closest(df, 7, cond1, 'ut')

    objtype                     arc
    position                     p1
    ut          2016-07-15 11:30:00
    Name: 6, dtype: object

Perfect! You mentioned that there were 2 items that were equally close, but I didn't care to deliver both. I'll leave that as an exercise for you. This function did deliver a row that was closest and satisfied the conditions.
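For the tie case mentioned above, here is one possible sketch. The name `get_all_closest` is made up for illustration, and the frame is rebuilt with a fixed date (2016-07-15, an assumption) so the example is reproducible. It keeps every candidate whose distance equals the minimum instead of only the first one:

```python
import pandas as pd

data = {'objtype': ['bias', 'bias', 'flat', 'flat', 'stdstar', 'flat', 'arc',
                    'target1', 'arc', 'flat', 'flat', 'flat', 'bias', 'bias'],
        # fixed date assumed here so the result does not depend on "today"
        'ut': pd.date_range("2016-07-15 11:00", periods=14, freq="5min"),
        'position': ['p0', 'p0', 'p0', 'p0', 'p1', 'p1', 'p1',
                     'p2', 'p2', 'p2', 'p0', 'p0', 'p0', 'p0']}
df = pd.DataFrame(data)


def get_all_closest(df, idx, bool_cond, to_this):
    """Hypothetical variant of get_closest that returns every tied row."""
    others = df.loc[bool_cond, to_this]
    target = df.loc[idx, to_this]
    dist = (others - target).abs()
    # keep every index label whose distance equals the smallest distance
    return df.loc[dist[dist == dist.min()].index]


ties = get_all_closest(df, 7, df.objtype == 'arc', 'ut')
print(ties.index.tolist())  # [6, 8]
```

Both arcs are 5 minutes away from target1, so both rows come back. If no row satisfies the condition, the result is simply an empty frame.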
What about:

    get_closest(df, 7, cond1 & cond2, 'ut')

    objtype                     arc
    position                     p2
    ut          2016-07-15 11:40:00
    Name: 8, dtype: object

Great! That's what we wanted.
Explanation of get_closest

- df: the dataframe we care about.
- idx: the index that represents our target.
- bool_cond: a True/False series used to slice our df.
- to_this: the column name to use to measure distance from.
    def get_closest(df, idx, bool_cond, to_this):
        # filter the dataframe down to the candidate rows
        others = df.loc[bool_cond, to_this]
        # get the to_this value for the target row
        target = df.loc[idx, to_this]
        # get the index value of the smallest absolute difference
        # and use it to get the resulting row
        return df.loc[(others - target).abs().idxmin()]
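One case the function does not cover is a condition that matches no rows. As far as I know, `idxmin` raises an error on an empty selection rather than returning a missing value. A minimal sketch of a guarded variant follows. The name `get_closest_or_none` and the choice to return `None` are my own assumptions, not part of the original answer, and the frame again uses a fixed date for reproducibility:

```python
import pandas as pd

data = {'objtype': ['bias', 'bias', 'flat', 'flat', 'stdstar', 'flat', 'arc',
                    'target1', 'arc', 'flat', 'flat', 'flat', 'bias', 'bias'],
        # fixed date assumed here so the result does not depend on "today"
        'ut': pd.date_range("2016-07-15 11:00", periods=14, freq="5min"),
        'position': ['p0', 'p0', 'p0', 'p0', 'p1', 'p1', 'p1',
                     'p2', 'p2', 'p2', 'p0', 'p0', 'p0', 'p0']}
df = pd.DataFrame(data)


def get_closest_or_none(df, idx, bool_cond, to_this):
    """Hypothetical guarded variant: returns None when nothing matches."""
    others = df.loc[bool_cond, to_this]
    if others.empty:
        # no candidate satisfies the conditions, so there is no closest row
        return None
    target = df.loc[idx, to_this]
    return df.loc[(others - target).abs().idxmin()]


is_arc = df.objtype == 'arc'
hit = get_closest_or_none(df, 7, is_arc & (df.position == 'p2'), 'ut')
miss = get_closest_or_none(df, 7, is_arc & (df.position == 'p0'), 'ut')
print(hit.name)  # 8
print(miss)      # None
```

Returning `None` is only one option. Raising a descriptive error may be a better fit if a missing match means something went wrong upstream.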