How can I determine what coordinates if any fall within a specific distance between each other?
【Posted】: 2021-07-30 12:36:50
【Question】:
from math import radians, cos, sin, asin, sqrt
import pandas as pd

df = pd.DataFrame({
    'Id': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
    'Feature': ['Truck', 'Truck', 'Truck', 'Truck', 'Truck', 'Van', 'Van', 'Van', 'Van', 'Car', 'Car', 'Car'],
    'Lat': [39.57713, 39.57723, 39.57671, 39.57672, 39.57697, 39.57188, 39.57151, 39.57153, 39.57197, 39.57613, 39.57577, 39.57595],
    'Long': [46.87062, 46.87004, 46.87001, 46.87066, 46.87027, 46.87489, 46.87482, 46.8752, 46.87528, 46.8757, 46.87572, 46.87545],
})

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance in meters between two points
    on the earth (specified in decimal degrees).
    Note the argument order: longitude first, then latitude.
    """
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    # Radius of earth in meters is 6371000
    distance = 6371000 * c
    return distance
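How can I find the Ids of Truck/Car pairs within 420 meters of each other, Truck/Van pairs within 655 meters, and Car/Van pairs within 425 meters?
The ideal output would be:
Truck 3 is within distance of Car 11
Truck 3 is within distance of Van 5
Car 10 is within distance of Van 8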
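The expected pairs above can be checked by hand with the question's own formula. The following is a minimal sketch, not part of the original post: the helper `dist` and the `checks` dictionary are hypothetical names introduced here, and the coordinates are copied from the question's DataFrame rows. It mainly illustrates that `haversine` takes longitude before latitude.

```python
from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters; longitude comes before latitude."""
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 6371000 * 2 * asin(sqrt(a))

# (Lat, Long) values copied from the question's rows
truck_3 = (39.57672, 46.87066)
car_10 = (39.57577, 46.87572)
car_11 = (39.57595, 46.87545)
van_5 = (39.57188, 46.87489)
van_8 = (39.57197, 46.87528)

def dist(p, q):
    # Hypothetical helper: p and q are (lat, long) tuples,
    # so the values are reordered to (lon, lat) for haversine.
    return haversine(p[1], p[0], q[1], q[0])

# Pair label -> (computed distance in meters, threshold in meters)
checks = {
    "Truck 3 / Car 11": (dist(truck_3, car_11), 420),
    "Truck 3 / Van 5": (dist(truck_3, van_5), 655),
    "Car 10 / Van 8": (dist(car_10, van_8), 425),
}

for name, (d, limit) in checks.items():
    print(f"{name}: {d:.1f} m, limit {limit} m, within: {d < limit}")
```

Each of the three pairs comes out just under its threshold, so any solution has to compute the distances precisely and with the arguments in the right order.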
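【Comments】:
Within X meters of what? A given point? Each other?
Does this answer your question? Pandas: calculate haversine distance within each group of rows
@Cimbali Trucks -> Cars (which trucks are within 420 meters of cars), Trucks -> Vans (which trucks are within 655 meters of vans), Cars -> Vans (which cars are within 425 meters of vans).
@mozway Negative. That computes the distance between rows with the same Id.
All of your Ids are different here. Should the Id column be [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]? If not, please provide the expected output.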
【Answer 1】:
You can use pd.merge(how='cross') (available since pandas 1.2) to generate all the pairs you want:
>>> groups = df.groupby('Feature')
>>> pd.merge(groups.get_group('Car'), groups.get_group('Truck'), how='cross', suffixes=('', '_cmp'))
    Id Feature       Lat      Long  Id_cmp Feature_cmp   Lat_cmp  Long_cmp
0    9     Car  39.57613  46.87570       0       Truck  39.57713  46.87062
1    9     Car  39.57613  46.87570       1       Truck  39.57723  46.87004
2    9     Car  39.57613  46.87570       2       Truck  39.57671  46.87001
3    9     Car  39.57613  46.87570       3       Truck  39.57672  46.87066
4    9     Car  39.57613  46.87570       4       Truck  39.57697  46.87027
5   10     Car  39.57577  46.87572       0       Truck  39.57713  46.87062
6   10     Car  39.57577  46.87572       1       Truck  39.57723  46.87004
7   10     Car  39.57577  46.87572       2       Truck  39.57671  46.87001
8   10     Car  39.57577  46.87572       3       Truck  39.57672  46.87066
9   10     Car  39.57577  46.87572       4       Truck  39.57697  46.87027
10  11     Car  39.57595  46.87545       0       Truck  39.57713  46.87062
11  11     Car  39.57595  46.87545       1       Truck  39.57723  46.87004
12  11     Car  39.57595  46.87545       2       Truck  39.57671  46.87001
13  11     Car  39.57595  46.87545       3       Truck  39.57672  46.87066
14  11     Car  39.57595  46.87545       4       Truck  39.57697  46.87027
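To see what the cross merge does in isolation, here is a minimal sketch on two tiny frames. The frames `left` and `right` are made up for illustration and are not the question's data; `how='cross'` requires pandas 1.2 or later.

```python
import pandas as pd

# Two small illustrative frames (hypothetical data)
left = pd.DataFrame({"Id": [9, 10], "Feature": ["Car", "Car"]})
right = pd.DataFrame({"Id": [0, 1, 2], "Feature": ["Truck", "Truck", "Truck"]})

# how="cross" pairs every row of `left` with every row of `right`;
# columns present in both frames get the given suffixes, so the
# right-hand copies become Id_cmp and Feature_cmp.
pairs = pd.merge(left, right, how="cross", suffixes=("", "_cmp"))

print(pairs)
```

The result has `len(left) * len(right)` rows, which is why this approach grows quadratically and is best suited to modest group sizes.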
This makes it easy to generate all the comparisons we want to make:
>>> distances = {('Car', 'Truck'): 420, ('Truck', 'Van'): 655, ('Car', 'Van'): 425}
>>> all_cmp = pd.concat([
...     pd.merge(groups.get_group(dist_from), groups.get_group(dist_to), how='cross', suffixes=('', '_cmp'))
...     for dist_from, dist_to in distances
... ])
>>> all_cmp.head()
   Id Feature       Lat     Long  Id_cmp Feature_cmp   Lat_cmp  Long_cmp
0   9     Car  39.57613  46.8757       0       Truck  39.57713  46.87062
1   9     Car  39.57613  46.8757       1       Truck  39.57723  46.87004
2   9     Car  39.57613  46.8757       2       Truck  39.57671  46.87001
3   9     Car  39.57613  46.8757       3       Truck  39.57672  46.87066
4   9     Car  39.57613  46.8757       4       Truck  39.57697  46.87027
>>> all_cmp.tail()
    Id Feature       Lat      Long  Id_cmp Feature_cmp   Lat_cmp  Long_cmp
7   10     Car  39.57577  46.87572       8         Van  39.57197  46.87528
8   11     Car  39.57595  46.87545       5         Van  39.57188  46.87489
9   11     Car  39.57595  46.87545       6         Van  39.57151  46.87482
10  11     Car  39.57595  46.87545       7         Van  39.57153  46.87520
11  11     Car  39.57595  46.87545       8         Van  39.57197  46.87528
Computing the distances is then straightforward; we also need to align the threshold distance with each row. Note that haversine takes longitude before latitude, so the Long columns must be passed first:
>>> dist = all_cmp.agg(lambda s: haversine(s['Long'], s['Lat'], s['Long_cmp'], s['Lat_cmp']), axis='columns')
>>> thresh = all_cmp[['Feature', 'Feature_cmp']].agg(lambda s: distances[tuple(s)], axis='columns')
From there, compare the two, keep the rows you want, and optionally aggregate:
>>> all_cmp[dist < thresh]
    Id Feature       Lat      Long  Id_cmp Feature_cmp   Lat_cmp  Long_cmp
13  11     Car  39.57595  46.87545       3       Truck  39.57672  46.87066
12   3   Truck  39.57672  46.87066       5         Van  39.57188  46.87489
7   10     Car  39.57577  46.87572       8         Van  39.57197  46.87528
>>> close = all_cmp[dist < thresh].groupby('Id')['Id_cmp'].agg(list)
>>> close
Id
3     [5]
10    [8]
11    [3]
Name: Id_cmp, dtype: object
>>> df.merge(close.rename('within dist').reset_index())
   Id Feature       Lat      Long within dist
0   3   Truck  39.57672  46.87066         [5]
1  10     Car  39.57577  46.87572         [8]
2  11     Car  39.57595  46.87545         [3]
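As a variation on the steps above, the row-wise `agg` calls can be replaced by a vectorized distance computation. This is a sketch under stated assumptions rather than the answerer's code: `haversine_np` is a hypothetical NumPy port of the question's `haversine` (same formula, same longitude-first argument order), and the `limit` and `dist` columns are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Id": list(range(12)),
    "Feature": ["Truck"] * 5 + ["Van"] * 4 + ["Car"] * 3,
    "Lat": [39.57713, 39.57723, 39.57671, 39.57672, 39.57697, 39.57188,
            39.57151, 39.57153, 39.57197, 39.57613, 39.57577, 39.57595],
    "Long": [46.87062, 46.87004, 46.87001, 46.87066, 46.87027, 46.87489,
             46.87482, 46.8752, 46.87528, 46.8757, 46.87572, 46.87545],
})

def haversine_np(lon1, lat1, lon2, lat2):
    """Vectorized great-circle distance in meters (longitude first)."""
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 6371000 * 2 * np.arcsin(np.sqrt(a))

# Thresholds in meters for each pair of feature types (from the question)
distances = {("Car", "Truck"): 420, ("Truck", "Van"): 655, ("Car", "Van"): 425}
groups = df.groupby("Feature")

frames = []
for (src, dst), limit in distances.items():
    # All src/dst row pairs, tagged with the threshold that applies to them
    pairs = pd.merge(groups.get_group(src), groups.get_group(dst),
                     how="cross", suffixes=("", "_cmp"))
    pairs["limit"] = limit
    frames.append(pairs)
all_cmp = pd.concat(frames, ignore_index=True)

# One vectorized call over whole columns instead of a Python call per row
all_cmp["dist"] = haversine_np(all_cmp["Long"], all_cmp["Lat"],
                               all_cmp["Long_cmp"], all_cmp["Lat_cmp"])
close = all_cmp[all_cmp["dist"] < all_cmp["limit"]]

for row in close.itertuples(index=False):
    print(f"{row.Feature} {row.Id} is within distance of {row.Feature_cmp} {row.Id_cmp}")
```

Storing the threshold as a column when each block is built avoids the separate dictionary lookup per row, and `ignore_index=True` gives the concatenated frame a unique index.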
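【Discussion】:
According to my manual calculations, these are the only features that fall within the specified distances: Truck 3 is within distance of Car 11, Truck 3 is within distance of Van 5, Car 10 is within distance of Van 8.
@NoobPythoner I reused your haversine function. Check whether the distances and thresholds computed by this code match your own calculations.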