How can I determine what coordinates, if any, fall within a specific distance between each other?

【Posted】: 2021-07-30 12:36:50

【Question】:
from math import radians, cos, sin, asin, sqrt

import pandas as pd

df = pd.DataFrame(columns=['Id', 'Feature', 'Lat', 'Long'])
df['Id'] = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
df['Feature'] = ['Truck', 'Truck', 'Truck', 'Truck', 'Truck', 'Van', 'Van', 'Van', 'Van', 'Car', 'Car', 'Car']
df['Lat'] = [39.57713, 39.57723, 39.57671, 39.57672, 39.57697, 39.57188, 39.57151, 39.57153, 39.57197, 39.57613, 39.57577, 39.57595]
df['Long'] = [46.87062, 46.87004, 46.87001, 46.87066, 46.87027, 46.87489, 46.87482, 46.8752, 46.87528, 46.8757, 46.87572, 46.87545]

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points 
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians 
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula 
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a)) 
    # Radius of earth in meters is 6371000
    distance = 6371000* c
    return distance
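
As a quick sanity check, the function can be called on a single pair from the sample data. The sketch below restates `haversine` so that it runs on its own and measures Truck 3 against Car 11. Note that the signature takes longitude first, so the `Long` value has to be passed before `Lat`. The names `truck3`, `car11` and `d` are illustrative only.

```python
from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 6371000 * 2 * asin(sqrt(a))

# (Lat, Long) pairs copied from the sample data
truck3 = (39.57672, 46.87066)
car11 = (39.57595, 46.87545)

# Longitude goes first, matching the function signature
d = haversine(truck3[1], truck3[0], car11[1], car11[0])
print(round(d, 1), d < 420)
```

With the arguments in this order the distance comes out just under the 420 m Truck/Car threshold. Swapping latitude and longitude gives a noticeably different value, because a degree of longitude is shorter than a degree of latitude at this location.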

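How can I find the Truck/Car IDs that are within 420 meters of each other, the Truck/Van IDs within 655 meters, and the Car/Van IDs within 425 meters?

The ideal output would be:

Truck 3 is within distance of Car 11

Truck 3 is within distance of Van 5

Car 10 is within distance of Van 8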

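【Comments】:

Within X meters of what? A given point? Each other?

Does this answer your question? Pandas: calculate haversine distance within each group of rows

@Cimbali Trucks -> Cars (which trucks are within 420 meters of a car), Trucks -> Vans (which trucks are within 655 meters of a van), Cars -> Vans (which cars are within 425 meters of a van).

@mozway Negative. That calculates the distance between rows with the same ID.

All of your Ids are different here. Should the Id column be [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]? If not, please provide the expected output.

【Answer 1】: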

You can generate all the pairs you want with pd.merge(how='cross'), which is available in pandas 1.2 and later:

>>> groups = df.groupby('Feature')
>>> pd.merge(groups.get_group('Car'), groups.get_group('Truck'), how='cross', suffixes=('', '_cmp'))
    Id Feature       Lat      Long  Id_cmp Feature_cmp   Lat_cmp  Long_cmp
0    9     Car  39.57613  46.87570       0       Truck  39.57713  46.87062
1    9     Car  39.57613  46.87570       1       Truck  39.57723  46.87004
2    9     Car  39.57613  46.87570       2       Truck  39.57671  46.87001
3    9     Car  39.57613  46.87570       3       Truck  39.57672  46.87066
4    9     Car  39.57613  46.87570       4       Truck  39.57697  46.87027
5   10     Car  39.57577  46.87572       0       Truck  39.57713  46.87062
6   10     Car  39.57577  46.87572       1       Truck  39.57723  46.87004
7   10     Car  39.57577  46.87572       2       Truck  39.57671  46.87001
8   10     Car  39.57577  46.87572       3       Truck  39.57672  46.87066
9   10     Car  39.57577  46.87572       4       Truck  39.57697  46.87027
10  11     Car  39.57595  46.87545       0       Truck  39.57713  46.87062
11  11     Car  39.57595  46.87545       1       Truck  39.57723  46.87004
12  11     Car  39.57595  46.87545       2       Truck  39.57671  46.87001
13  11     Car  39.57595  46.87545       3       Truck  39.57672  46.87066
14  11     Car  39.57595  46.87545       4       Truck  39.57697  46.87027

This makes it easy to generate all the comparisons we want:

>>> distances = {('Car', 'Truck'): 420, ('Truck', 'Van'): 655, ('Car', 'Van'): 425}
>>> all_cmp = pd.concat([pd.merge(groups.get_group(dist_from), groups.get_group(dist_to), how='cross', suffixes=('', '_cmp')) for dist_from, dist_to in distances])
>>> all_cmp.head()
   Id Feature       Lat     Long  Id_cmp Feature_cmp   Lat_cmp  Long_cmp
0   9     Car  39.57613  46.8757       0       Truck  39.57713  46.87062
1   9     Car  39.57613  46.8757       1       Truck  39.57723  46.87004
2   9     Car  39.57613  46.8757       2       Truck  39.57671  46.87001
3   9     Car  39.57613  46.8757       3       Truck  39.57672  46.87066
4   9     Car  39.57613  46.8757       4       Truck  39.57697  46.87027
>>> all_cmp.tail()
    Id Feature       Lat      Long  Id_cmp Feature_cmp   Lat_cmp  Long_cmp
7   10     Car  39.57577  46.87572       8         Van  39.57197  46.87528
8   11     Car  39.57595  46.87545       5         Van  39.57188  46.87489
9   11     Car  39.57595  46.87545       6         Van  39.57151  46.87482
10  11     Car  39.57595  46.87545       7         Van  39.57153  46.87520
11  11     Car  39.57595  46.87545       8         Van  39.57197  46.87528

Computing the distances is straightforward; note that haversine takes longitude before latitude. We also need to line up the threshold distance for each row:

>>> dist = all_cmp.agg(lambda s: haversine(s['Long'], s['Lat'], s['Long_cmp'], s['Lat_cmp']), axis='columns')
>>> thresh = all_cmp[['Feature', 'Feature_cmp']].agg(lambda s: distances[tuple(s)], axis='columns')

Then compare from there, keep the rows you want, and possibly aggregate:

>>> all_cmp[dist < thresh]
    Id Feature       Lat      Long  Id_cmp Feature_cmp   Lat_cmp  Long_cmp
13  11     Car  39.57595  46.87545       3       Truck  39.57672  46.87066
12   3   Truck  39.57672  46.87066       5         Van  39.57188  46.87489
7   10     Car  39.57577  46.87572       8         Van  39.57197  46.87528
>>> close = all_cmp[dist < thresh].groupby('Id')['Id_cmp'].agg(list)
>>> close
Id
3     [5]
10    [8]
11    [3]
Name: Id_cmp, dtype: object
>>> df.merge(close.rename('within dist').reset_index())
   Id Feature       Lat      Long  within dist
0   3   Truck  39.57672  46.87066          [5]
1  10     Car  39.57577  46.87572          [8]
2  11     Car  39.57595  46.87545          [3]
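
For a result in the sentence form the question asks for, a pandas-free cross-check can be sketched with `itertools.product`. This is only a sketch under stated assumptions: `points`, `within_distance` and `hits` are names introduced here for illustration, the rows are copied from the sample data, and `haversine` is restated with longitude passed first, as its signature requires.

```python
from itertools import product
from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 6371000 * 2 * asin(sqrt(a))

# (Id, Feature, Lat, Long) rows copied from the question's sample data
points = [
    (0, 'Truck', 39.57713, 46.87062), (1, 'Truck', 39.57723, 46.87004),
    (2, 'Truck', 39.57671, 46.87001), (3, 'Truck', 39.57672, 46.87066),
    (4, 'Truck', 39.57697, 46.87027), (5, 'Van', 39.57188, 46.87489),
    (6, 'Van', 39.57151, 46.87482), (7, 'Van', 39.57153, 46.87520),
    (8, 'Van', 39.57197, 46.87528), (9, 'Car', 39.57613, 46.87570),
    (10, 'Car', 39.57577, 46.87572), (11, 'Car', 39.57595, 46.87545),
]

# Thresholds in meters, keyed by (feature, feature_cmp) as in the answer
distances = {('Car', 'Truck'): 420, ('Truck', 'Van'): 655, ('Car', 'Van'): 425}

def within_distance(points, distances):
    """Return (id, feature, id_cmp, feature_cmp, meters) for each pair under its threshold."""
    hits = []
    for (id_a, feat_a, lat_a, lon_a), (id_b, feat_b, lat_b, lon_b) in product(points, repeat=2):
        limit = distances.get((feat_a, feat_b))
        if limit is None:
            continue  # this feature pairing is not one we were asked to compare
        d = haversine(lon_a, lat_a, lon_b, lat_b)  # longitude first
        if d < limit:
            hits.append((id_a, feat_a, id_b, feat_b, d))
    return hits

hits = within_distance(points, distances)
for id_a, feat_a, id_b, feat_b, d in hits:
    print(f"{feat_a} {id_a} is within distance of {feat_b} {id_b}")
```

On the sample data this reports the same three pairs that the question lists as the ideal output. Each sentence starts from the first feature of its threshold key, so the Car/Truck pair reads "Car 11 is within distance of Truck 3" rather than the reverse. The nested loop is quadratic in the number of points, as is the cross merge, so either approach suits small tables like this one.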

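【Discussion】:

According to my manual calculations, these are the only features that fall within the specified distances: Truck 3 is within distance of Car 11, Truck 3 is within distance of Van 5, and Car 10 is within distance of Van 8.

@NoobPythoner I reused your haversine function. Check that the distances and thresholds computed by this code match your calculations.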
