Python: how to group together points in a given distance?

【Posted】: 2016-04-29 10:05:37

【Question】:

I have a DataFrame containing users' origin-destination trips between different points (latitude/longitude). So we have the columns Origin_X, Origin_Y, Destination_X, Destination_Y.

df:

Trip Origin_X  Origin_Y  Destination_X Destination_Y
1   -33.55682 -70.78614   -33.44007     -70.6552
2   -33.49097 -70.77741   -33.48908     -70.76263
3   -33.37108 -70.6711    -33.73425     -70.76278

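I want to group together all trips whose origins and destinations lie within a radius of 1 km of each other. Two trips can be grouped if the distance between their origins is d <= 1 km and the distance between their destinations is d <= 1 km as well. To compute the distance between two coordinates I use the haversine function.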

from math import radians, sin, cos, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points 
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians 
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

    # haversine formula 
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a)) 
    r = 6371 # Radius of earth in kilometers. Use 3956 for miles
    return c * r
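For a whole DataFrame, the same formula can be written with NumPy so it runs on entire columns at once. The following is only a sketch: the name `haversine_np` is made up for illustration, and it assumes that the `*_X` columns hold latitude and the `*_Y` columns hold longitude (which is what the sample values around -33 and -70 suggest).

```python
import numpy as np
import pandas as pd

def haversine_np(lon1, lat1, lon2, lat2):
    """Great-circle distance in km; accepts scalars, NumPy arrays or pandas Series."""
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * 6371 * np.arcsin(np.sqrt(a))  # 6371 km = Earth radius

# The three trips from the question
df = pd.DataFrame({
    "Trip": [1, 2, 3],
    "Origin_X": [-33.55682, -33.49097, -33.37108],
    "Origin_Y": [-70.78614, -70.77741, -70.6711],
    "Destination_X": [-33.44007, -33.48908, -33.73425],
    "Destination_Y": [-70.6552, -70.76263, -70.76278],
})

# Assumption: *_X is latitude and *_Y is longitude, so Y is passed as lon and X as lat
df["dist"] = haversine_np(
    df["Origin_Y"], df["Origin_X"], df["Destination_Y"], df["Destination_X"]
)
print(df)
```

This gives the origin-to-destination length of each trip in one call, without `apply`. If the columns are actually stored the other way round, swap the arguments.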

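【Comments】:

Please see this question for a vectorized way of computing the haversine distance; you can add it as a new distance column and then bin/filter the df: ***.com/questions/25767596/…

【Answer 1】: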

You can do it like this:

import pandas as pd
from math import radians, sin, cos, asin, sqrt

def haversine(row):
    """
    Calculate the great circle distance between two points 
    on the earth (specified in decimal degrees)
    """
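    # Read the coordinates by column name (positional access such as row[1]
    # is deprecated in recent pandas versions).
    # Note: in the sample data the *_X columns look like latitudes (about -33)
    # and the *_Y columns like longitudes (about -70). The mapping below is kept
    # as in the original answer, and the output shown further down was produced
    # with it; swap the assignments if X is latitude in your data.
    lon1 = row['Origin_X']
    lat1 = row['Origin_Y']
    lon2 = row['Destination_X']
    lat2 = row['Destination_Y']

    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])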

    # haversine formula 
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a)) 
    r = 6371 # Radius of earth in kilometers. Use 3956 for miles
    return c * r

# Copy the trip table from the question to the clipboard first

df = pd.read_clipboard()
df['dist'] = df.apply(haversine, axis=1)

print(df)

   Trip  Origin_X  Origin_Y  Destination_X  Destination_Y       dist
0     1 -33.55682 -70.78614      -33.44007      -70.65520  15.177680
1     2 -33.49097 -70.77741      -33.48908      -70.76263   1.644918
2     3 -33.37108 -70.67110      -33.73425      -70.76278  16.785898

# Group the trips by whether dist is below 1 km
dfg = df.groupby(df['dist'] < 1)

# Select all trips whose dist is below 2 km
df[df['dist'] < 2]
   Trip  Origin_X  Origin_Y  Destination_X  Destination_Y      dist
1     2 -33.49097 -70.77741      -33.48908      -70.76263  1.644918
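Note that `dist` above is the length of each single trip (origin to destination), while the question asks to compare trips with each other. One possible reading of the question is sketched below: two trips are linked when their origins are at most 1 km apart and their destinations are at most 1 km apart, and linked trips are merged with a small union-find. The function `group_trips`, the tuple layout and trip 4 are made up for illustration, and latitude is assumed to be stored in the `*_X` columns.

```python
from itertools import combinations
from math import radians, sin, cos, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def group_trips(trips, max_km=1.0):
    """trips: list of (trip_id, o_lat, o_lon, d_lat, d_lon).
    Returns a list of groups, each a list of trip ids."""
    parent = {t[0]: t[0] for t in trips}

    def find(x):
        # Follow parent links to the root, shortening the path on the way
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Compare every pair of trips (quadratic in the number of trips)
    for a, b in combinations(trips, 2):
        close_origin = haversine(a[2], a[1], b[2], b[1]) <= max_km
        close_dest = haversine(a[4], a[3], b[4], b[3]) <= max_km
        if close_origin and close_dest:
            parent[find(a[0])] = find(b[0])  # merge the two groups

    groups = {}
    for t in trips:
        groups.setdefault(find(t[0]), []).append(t[0])
    return list(groups.values())

trips = [
    (1, -33.55682, -70.78614, -33.44007, -70.6552),
    (2, -33.49097, -70.77741, -33.48908, -70.76263),
    (3, -33.37108, -70.6711, -33.73425, -70.76278),
    (4, -33.55700, -70.78600, -33.44050, -70.65500),  # hypothetical trip near trip 1
]
groups = group_trips(trips)
print(groups)
```

Because groups are merged transitively, two trips in the same group can end up more than 1 km apart if they are connected through intermediate trips. The pairwise loop is also quadratic, so for large data a spatial index would probably be needed.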

【Answer 2】:

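You could loop over every point, compute the distance to all the other points, check whether that distance is less than or equal to 1 km, and add it to a dictionary where the key is the origin point and the value is an array of all the close points...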
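A minimal sketch of that idea, assuming the points are `(lat, lon)` tuples; the name `neighbours_within` and the fourth sample point are made up for illustration:

```python
from math import radians, sin, cos, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def neighbours_within(points, max_km=1.0):
    """points: list of (lat, lon) tuples.
    Returns a dict mapping each point to the list of other points within max_km."""
    close = {p: [] for p in points}
    for i, p in enumerate(points):
        for j, q in enumerate(points):
            # Skip the point itself, keep every other point within the radius
            if i != j and haversine(p[1], p[0], q[1], q[0]) <= max_km:
                close[p].append(q)
    return close

# Origins of the three trips from the question, plus one hypothetical nearby point
origins = [
    (-33.55682, -70.78614),
    (-33.49097, -70.77741),
    (-33.37108, -70.6711),
    (-33.55700, -70.78600),  # made up, close to the first origin
]
close = neighbours_within(origins)
for point, near in close.items():
    print(point, "->", near)
```

Identical coordinates collapse into a single dictionary key, so if duplicate points matter it would be safer to key the dictionary by trip id instead.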
