numpy - distances between two points from vectors of shape (n, 2)
【Posted】: 2020-12-22 02:29:05
【Question】:
Background
Suppose I have an array of coordinates with shape (n, 2), where each row is a coordinate (x, y).
X = np.random.random(shape) * 10 # just to generate (x,y).
---
[[9.47743968 8.60682597]
[7.35620992 6.87031756]
[5.05200433 3.62373581]
[4.33732145 3.72994235]
[4.34982473 4.46453609]]
...
The distance between two vectors Xi and Xj in X is |Xj - Xi|. The distances for all combinations in X can be obtained as shown in the figure.
Question
Is it possible to do this with numpy alone, without scipy (e.g. scipy.spatial.distance.pdist(X, metric='euclidean', *args, **kwargs))? Please help me understand which numpy functions are available and how to implement it.
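Research
scipy
I could look at the scipy code, since scipy.spatial.distance.pdist(X, metric='euclidean', *args, **kwargs) appears to compute the distances between the vectors. Its documentation describes it as:
Pairwise distances between observations in n-dimensional space.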
import scipy.spatial
scipy.spatial.distance.pdist(X)
---
array([2.74136411, 6.6645079 , 7.08553522, 6.59173729, 3.9811627 ,
4.35610423, 3.85047223, 0.72253128, 1.09544571, 0.73470014])
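Lines 2050-2059 of scipy's distance.py appear to be the code that calls the distance function for the chosen metric (e.g. euclidean). However, it goes into the C code in distance_wrap.c, so it is not numpy.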
static PyObject *pdist_seuclidean_double_wrap(PyObject *self, PyObject *args,
                                              PyObject *kwargs)
{
  PyArrayObject *X_, *dm_, *var_;
  int m, n;
  double *dm;
  const double *X, *var;
  static char *kwlist[] = {"X", "dm", "V", NULL};
  if (!PyArg_ParseTupleAndKeywords(args, kwargs,
            "O!O!O!:pdist_seuclidean_double_wrap", kwlist,
            &PyArray_Type, &X_,
            &PyArray_Type, &dm_,
            &PyArray_Type, &var_)) {
    return 0;
  }
  else {
    NPY_BEGIN_ALLOW_THREADS;
    X = (double*)X_->data;
    dm = (double*)dm_->data;
    var = (double*)var_->data;
    m = X_->dimensions[0];
    n = X_->dimensions[1];
    pdist_seuclidean(X, var, dm, m, n);
    NPY_END_ALLOW_THREADS;
  }
  return Py_BuildValue("d", 0.0);
}
【Comments on the question】:
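【Answer 1】:
import numpy as np

shape = (10, 2)
X = np.random.random(shape)
# X[:,0] --> x values
# X[:,1] --> y values
# Broadcasting (n, 1, 2) against (1, n, 2) gives every pairwise difference, shape (n, n, 2)
dist = np.sqrt(np.sum((X[:, np.newaxis, :] - X[np.newaxis, :, :]) ** 2, axis=-1))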
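The answer above yields the full symmetric (n, n) matrix, whereas the pdist output shown in the question is a flat array with one entry per pair i < j. A minimal sketch of getting that condensed form with numpy alone, assuming the pair ordering (0,1), (0,2), ..., (1,2), ... is what is wanted; the function name pairwise_condensed is made up for this illustration:

```python
import numpy as np

def pairwise_condensed(X):
    """Distances for all pairs i < j, taken row by row from the upper triangle."""
    diff = X[:, np.newaxis, :] - X[np.newaxis, :, :]   # shape (n, n, d)
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))          # shape (n, n)
    i, j = np.triu_indices(X.shape[0], k=1)             # strict upper triangle indices
    return dist[i, j]

# Three collinear points so the distances are easy to check by hand
X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
condensed = pairwise_condensed(X)
print(condensed)  # one value each for pairs (0,1), (0,2), (1,2)
```

This still builds the full (n, n, d) difference array first, so it trades memory for simplicity; for large n that may matter.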
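【Comments】:
You wrote the squared distance, not the distance itself.
How simple. Thank you very much. I looked into what np.newaxis is doing.
It plays a crucial role in broadcasting: numpy.org/devdocs/user/theory.broadcasting.html
It still looks a bit off; the appropriate function to use is numpy.hypot, which handles many corner cases in distance calculations.
【Answer 2】: Sample code using numpy that follows the process in your figure: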
import numpy as np

N = 3
shape = (N, 2)
X = np.random.random(shape) * 10
print("X:\n{}\n\n".format(X))
# X_br[i, j] == X[j]: the whole array repeated along a new leading axis
X_br = np.broadcast_to(X, (N, N, 2))
# X_bc[i, j] == X[i]: each point repeated along a new middle axis
X_bc = np.repeat(X[:, np.newaxis], N, axis=1)
print("X broadcast to row:\n{}\n\nX broadcast to column:\n{}\n\n".format(X_br, X_bc))
sub_matrix = X_bc - X_br  # sub_matrix[i, j] == X[i] - X[j]
distance_matrix = np.sqrt(np.sum(sub_matrix**2, axis=-1))
print("X_bc - X_br:\n{}\n\nAll distance:\n{}".format(sub_matrix, distance_matrix))
Output:
X:
[[2.05479961 2.74672028]
[2.10302902 5.36635678]
[4.82345137 9.0050768 ]]
X broadcast to row:
[[[2.05479961 2.74672028]
[2.10302902 5.36635678]
[4.82345137 9.0050768 ]]
[[2.05479961 2.74672028]
[2.10302902 5.36635678]
[4.82345137 9.0050768 ]]
[[2.05479961 2.74672028]
[2.10302902 5.36635678]
[4.82345137 9.0050768 ]]]
X broadcast to column:
[[[2.05479961 2.74672028]
[2.05479961 2.74672028]
[2.05479961 2.74672028]]
[[2.10302902 5.36635678]
[2.10302902 5.36635678]
[2.10302902 5.36635678]]
[[4.82345137 9.0050768 ]
[4.82345137 9.0050768 ]
[4.82345137 9.0050768 ]]]
X_bc - X_br:
[[[ 0. 0. ]
[-0.04822941 -2.6196365 ]
[-2.76865177 -6.25835652]]
[[ 0.04822941 2.6196365 ]
[ 0. 0. ]
[-2.72042236 -3.63872002]]
[[ 2.76865177 6.25835652]
[ 2.72042236 3.63872002]
[ 0. 0. ]]]
All distance:
[[0. 2.62008043 6.8434245 ]
[2.62008043 0. 4.54323466]
[6.8434245 4.54323466 0. ]]
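One of the comments on the first answer suggests numpy.hypot. For 2-D points specifically, that could be sketched as follows. np.hypot(a, b) computes sqrt(a**2 + b**2) element-wise, so it only replaces the square, sum and square-root steps when there are exactly two coordinates; the commenter's claim about corner cases is not verified here, and the variable names dx and dy are chosen for this illustration.

```python
import numpy as np

# Three collinear points so the distances are easy to check by hand
X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# Pairwise coordinate differences via broadcasting: (n, 1) - (1, n) -> (n, n)
dx = X[:, np.newaxis, 0] - X[np.newaxis, :, 0]
dy = X[:, np.newaxis, 1] - X[np.newaxis, :, 1]

# Element-wise sqrt(dx**2 + dy**2)
distance_matrix = np.hypot(dx, dy)
print(distance_matrix)
```

For points with more than two coordinates, the np.sqrt(np.sum(..., axis=-1)) form from the answers generalizes directly, while this one does not.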
【Comments】: