ELKI DBSCAN LngLatDistanceFunction producing one cluster
I'm using ELKI's LngLatDistanceFunction to cluster lon/lat points, but it's returning only 1 cluster (it was returning more clusters when I used Euclidean distance). I tried multiple epsilon values, but I'm still getting 1 cluster.
int minPts = 20;
double eps = 10;
ListParameterization params = new ListParameterization();
params.addParameter(DBSCAN.DISTANCE_FUNCTION_ID, LngLatDistanceFunction.class);
params.addParameter(DBSCAN.Parameterizer.MINPTS_ID, minPts);
params.addParameter(DBSCAN.Parameterizer.EPSILON_ID, eps);
params.addParameter(AbstractDatabase.Parameterizer.DATABASE_CONNECTION_ID, dbcon);
params.addParameter(AbstractDatabase.Parameterizer.INDEX_ID, RStarTreeFactory.class);
params.addParameter(RStarTreeFactory.Parameterizer.BULK_SPLIT_ID, SortTileRecursiveBulkSplit.class);
params.addParameter(AbstractPageFileFactory.Parameterizer.PAGE_SIZE_ID, 600);
Database db = ClassGenericsUtil.parameterizeOrAbort(StaticArrayDatabase.class, params);
db.initialize();
GeneralizedDBSCAN dbscan = ClassGenericsUtil.parameterizeOrAbort(GeneralizedDBSCAN.class, params);
The distance is in meters. Therefore, you need to choose epsilon such that some - but not all - points have more than minPts neighbors.
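To get a feeling for the scale, here is a minimal sketch in plain Java (not ELKI code) of a great-circle distance in meters. It assumes a spherical Earth with a mean radius of 6371008.8 m and uses the haversine formula; the class and method names are made up for illustration, and ELKI's own distance function may use a different Earth model, so treat the numbers as approximate.

```java
public class GeoScale {
    // Mean Earth radius in meters (spherical approximation; an assumption of this sketch).
    static final double EARTH_RADIUS_M = 6371008.8;

    /** Haversine great-circle distance in meters; all arguments are in degrees. */
    static double haversineMeters(double lng1, double lat1, double lng2, double lat2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = phi2 - phi1;
        double dLambda = Math.toRadians(lng2 - lng1);
        double a = Math.sin(dPhi / 2) * Math.sin(dPhi / 2)
                 + Math.cos(phi1) * Math.cos(phi2)
                 * Math.sin(dLambda / 2) * Math.sin(dLambda / 2);
        // Clamp to 1.0 to guard against rounding errors for near-antipodal points.
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    public static void main(String[] args) {
        // Two points 0.001 degrees of latitude apart are roughly 111 m apart.
        System.out.println(haversineMeters(13.0, 52.0, 13.0, 52.001));
    }
}
```

With eps=10, two points only count as neighbors if they are within about 10 meters of each other, which is a very tight radius for most geographic data sets.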
You can use the KNNDistancesSampler class to help you estimate the parameter. It is not an automatic estimation, but you can plot the resulting distances and check for a "knee" in the plot.
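The idea behind such a k-distance plot can be sketched in plain Java. This is not the ELKI class, just a brute-force illustration under the same spherical-Earth assumption as a haversine distance; the class name, method names, and sample points are hypothetical. For every point it computes the distance to its k-th nearest other point and sorts the values in descending order, which is the curve you would plot.

```java
import java.util.Arrays;

public class KDistanceSketch {
    // Mean Earth radius in meters (spherical approximation; an assumption of this sketch).
    static final double EARTH_RADIUS_M = 6371008.8;

    /** Haversine distance in meters between two points given as {lng, lat} in degrees. */
    static double haversineMeters(double[] p, double[] q) {
        double phi1 = Math.toRadians(p[1]);
        double phi2 = Math.toRadians(q[1]);
        double dPhi = phi2 - phi1;
        double dLambda = Math.toRadians(q[0] - p[0]);
        double a = Math.sin(dPhi / 2) * Math.sin(dPhi / 2)
                 + Math.cos(phi1) * Math.cos(phi2)
                 * Math.sin(dLambda / 2) * Math.sin(dLambda / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** For every point, the distance to its k-th nearest other point, sorted descending. */
    static double[] sortedKDistances(double[][] points, int k) {
        int n = points.length;
        if (k < 1 || k >= n) {
            throw new IllegalArgumentException("need 1 <= k < number of points");
        }
        double[] kDist = new double[n];
        for (int i = 0; i < n; i++) {
            double[] d = new double[n - 1];
            int m = 0;
            for (int j = 0; j < n; j++) {
                if (j != i) {
                    d[m++] = haversineMeters(points[i], points[j]);
                }
            }
            Arrays.sort(d);
            kDist[i] = d[k - 1];
        }
        Arrays.sort(kDist);
        // Reverse in place so the largest k-distance comes first.
        for (int lo = 0, hi = n - 1; lo < hi; lo++, hi--) {
            double t = kDist[lo];
            kDist[lo] = kDist[hi];
            kDist[hi] = t;
        }
        return kDist;
    }

    public static void main(String[] args) {
        // Three points about 111 m apart and one outlier about 111 km away (made-up data).
        double[][] pts = { {13.0, 52.0}, {13.0, 52.001}, {13.0, 52.002}, {13.0, 53.0} };
        for (double d : sortedKDistances(pts, 1)) {
            System.out.println(d);
        }
    }
}
```

In the printed list, the outlier's large distance comes first, followed by a sharp drop to the small distances of the dense points. An epsilon slightly above the values after that drop is a reasonable first guess. The brute-force loop is quadratic, so for a large data set you would run it on a sample.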
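Pay attention to the "noise" flag.
- If you have a single cluster, and it is "noise", then epsilon is too small.
- If you have a single cluster, and it is a "cluster" (not noise), then epsilon is too large.
- If you have a single cluster, and it is "noise", then minPts may be too large.
- If you have a single cluster, and it is a cluster, then minPts may be too small.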
For most applications, it is easier to fix minPts to 4, or 10, or 20, and then adjust the epsilon parameter as desired. In geographic applications like yours, it may be easier to fix the epsilon parameter and vary the minPts parameter instead. For example, you may know that a distance of less than 10000 meters indicates that the objects are "neighbors".
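One way to explore the "fix epsilon, vary minPts" approach before running DBSCAN is to count how many points would be core points for a few minPts values. The following is a rough plain-Java sketch, not ELKI code: the names and sample data are hypothetical, it assumes a spherical Earth, and it uses the common DBSCAN convention that a point counts as its own neighbor.

```java
public class CorePointCount {
    // Mean Earth radius in meters (spherical approximation; an assumption of this sketch).
    static final double EARTH_RADIUS_M = 6371008.8;

    /** Haversine distance in meters between two points given as {lng, lat} in degrees. */
    static double haversineMeters(double[] p, double[] q) {
        double phi1 = Math.toRadians(p[1]);
        double phi2 = Math.toRadians(q[1]);
        double dPhi = phi2 - phi1;
        double dLambda = Math.toRadians(q[0] - p[0]);
        double a = Math.sin(dPhi / 2) * Math.sin(dPhi / 2)
                 + Math.cos(phi1) * Math.cos(phi2)
                 * Math.sin(dLambda / 2) * Math.sin(dLambda / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** Number of points within epsMeters of each point, counting the point itself. */
    static int[] neighborCounts(double[][] points, double epsMeters) {
        int n = points.length;
        int[] counts = new int[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (haversineMeters(points[i], points[j]) <= epsMeters) {
                    counts[i]++;
                }
            }
        }
        return counts;
    }

    /** How many points have at least minPts neighbors, i.e. would be core points. */
    static int corePoints(int[] counts, int minPts) {
        int core = 0;
        for (int c : counts) {
            if (c >= minPts) {
                core++;
            }
        }
        return core;
    }

    public static void main(String[] args) {
        // Three points about 111 m apart and one outlier about 111 km away (made-up data).
        double[][] pts = { {13.0, 52.0}, {13.0, 52.001}, {13.0, 52.002}, {13.0, 53.0} };
        int[] counts = neighborCounts(pts, 150.0);
        for (int minPts = 2; minPts <= 4; minPts++) {
            System.out.println("minPts=" + minPts + " core points=" + corePoints(counts, minPts));
        }
    }
}
```

If no point is a core point, everything ends up as noise; if every point is a core point, the data is likely to merge into one large cluster. A useful setting sits between those extremes.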
Algorithms such as OPTICS are helpful for choosing the parameter visually. (Use the MiniGUI!)