python - Query in pandas for closest object (in time) which meets a set of conditions
I am using pandas to manage a set of files which have several properties:
    import pandas as pd

    data = {'objtype': ['bias', 'bias', 'flat', 'flat', 'stdstar', 'flat', 'arc',
                        'target1', 'arc', 'flat', 'flat', 'flat', 'bias', 'bias'],
            'ut': pd.date_range("11:00", "12:05", freq="5min").values,
            'position': ['p0', 'p0', 'p0', 'p0', 'p1', 'p1', 'p1',
                         'p2', 'p2', 'p2', 'p0', 'p0', 'p0', 'p0']}
    df = pd.DataFrame(data=data)
which gives me a DataFrame like this one:
        objtype position                  ut
    0      bias       p0 2016-07-15 11:00:00
    1      bias       p0 2016-07-15 11:05:00
    2      flat       p0 2016-07-15 11:10:00
    3      flat       p0 2016-07-15 11:15:00
    4   stdstar       p1 2016-07-15 11:20:00
    5      flat       p1 2016-07-15 11:25:00
    6       arc       p1 2016-07-15 11:30:00
    7   target1       p2 2016-07-15 11:35:00
    8       arc       p2 2016-07-15 11:40:00
    9      flat       p2 2016-07-15 11:45:00
    10     flat       p0 2016-07-15 11:50:00
    11     flat       p0 2016-07-15 11:55:00
    12     bias       p0 2016-07-15 12:00:00
    13     bias       p0 2016-07-15 12:05:00
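I would like to get the index of the objects which meet a temporal condition in addition to another one. For example:
I want the closest object to target1 with the objtype 'arc'. This query would have 2 candidates: 6 and 8.
If, for example, I query for the closest object to target1 with the objtype 'arc' which also shares the same position (p2), I would get 8.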
I have been trying to slice the data frame according to the initial conditions and then use numpy, but I am making a non-pythonic mess.
Any advice?
Let's build a function:
    def get_closest(df, idx, bool_cond, to_this):
        others = df.loc[bool_cond, to_this]
        target = df.loc[idx, to_this]
        return df.loc[(others - target).abs().idxmin()]
First, I assume that when you are looking for something closest to something else, you have a unique index. If you don't, create one. In this case, index 7 is the target, corresponding to the value 'target1'. Next, build boolean series representing the conditions you care about.
    cond1 = df.objtype == 'arc'
    cond2 = df.position == df.loc[7, 'position']
Then we can call our function like this:
    get_closest(df, 7, cond1, 'ut')

    objtype                     arc
    position                     p1
    ut          2016-07-15 11:30:00
    Name: 6, dtype: object
Perfect! You mentioned that there were 2 items equally close, and I didn't take care to deliver both. I'll leave that as an exercise for you. The function did deliver a row that was closest and satisfied the conditions.
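For readers who want both tied rows, here is a minimal sketch of one way to do it. The name `get_all_closest` is hypothetical (it is not part of the answer above), and the explicit date in `date_range` is an assumption made only so the example is reproducible. Instead of `idxmin`, it keeps every candidate whose distance equals the minimum.

```python
import pandas as pd

# Same data as in the question; the explicit date is an assumption
# so that the example does not depend on the day it is run.
data = {
    "objtype": ["bias", "bias", "flat", "flat", "stdstar", "flat", "arc",
                "target1", "arc", "flat", "flat", "flat", "bias", "bias"],
    "ut": pd.date_range("2016-07-15 11:00", "2016-07-15 12:05", freq="5min"),
    "position": ["p0", "p0", "p0", "p0", "p1", "p1", "p1",
                 "p2", "p2", "p2", "p0", "p0", "p0", "p0"],
}
df = pd.DataFrame(data)


def get_all_closest(df, idx, bool_cond, to_this):
    """Hypothetical helper: return every row tied for the smallest distance."""
    others = df.loc[bool_cond, to_this]
    target = df.loc[idx, to_this]
    dist = (others - target).abs()
    # Keep all candidates whose distance equals the minimum, not just the first
    return df.loc[dist[dist == dist.min()].index]


is_arc = df["objtype"] == "arc"
same_pos = df["position"] == df.loc[7, "position"]

ties = get_all_closest(df, 7, is_arc, "ut")                   # rows 6 and 8
single = get_all_closest(df, 7, is_arc & same_pos, "ut")      # row 8 only
```

The result is always a DataFrame, so the caller can check `len(ties)` to see whether the query was ambiguous.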
What about:

    get_closest(df, 7, cond1 & cond2, 'ut')

    objtype                     arc
    position                     p2
    ut          2016-07-15 11:40:00
    Name: 8, dtype: object

Great! That's what we wanted.
Explanation of get_closest

df: the DataFrame we care about.
idx: the index that represents our target.
bool_cond: a True/False series used to slice our df.
to_this: the column name we use to measure distance from.
    def get_closest(df, idx, bool_cond, to_this):
        # filter the dataframe down to the candidate rows
        others = df.loc[bool_cond, to_this]
        # the to_this value of the target row
        target = df.loc[idx, to_this]
        # find the index value with the smallest absolute difference
        # and use it to return the resulting row
        return df.loc[(others - target).abs().idxmin()]
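Two edge cases are not covered by the function as written: a condition that also matches the target row (which would then be its own closest match, at distance zero), and a condition that matches nothing. The sketch below is one possible guarded variant. The name `get_closest_safe`, the choice to return `None`, and the tiny `obs` frame are all assumptions made for illustration.

```python
import pandas as pd


def get_closest_safe(df, idx, bool_cond, to_this):
    """Hypothetical guarded variant of get_closest."""
    # Exclude the target row so it can never match itself
    mask = bool_cond & (df.index != idx)
    others = df.loc[mask, to_this]
    # No candidate satisfies the conditions: signal that with None
    if others.empty:
        return None
    target = df.loc[idx, to_this]
    return df.loc[(others - target).abs().idxmin()]


# Small illustrative frame (not the question's data)
obs = pd.DataFrame({
    "objtype": ["arc", "target1", "arc"],
    "ut": pd.to_datetime(["2016-07-15 11:30",
                          "2016-07-15 11:35",
                          "2016-07-15 11:45"]),
    "position": ["p1", "p2", "p2"],
})

nearest_arc = get_closest_safe(obs, 1, obs["objtype"] == "arc", "ut")
only_self = get_closest_safe(obs, 1, obs["objtype"] == "target1", "ut")
no_match = get_closest_safe(obs, 1, obs["objtype"] == "flat", "ut")
```

Here `nearest_arc` is row 0 (5 minutes away, versus 10 minutes for row 2), while the other two calls return `None` because no other row qualifies.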