What implicitly occurs during numpy addition?

Posted 2023-02-23

[Asked]: 2018-02-08 09:20:44 [Question]:

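What happens when numpy arrays are added? I have created a CUDA application in C++ that computes squared distances, and I interface with it from Python using cdll. The Python wrapper looks like this: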

def sqdist(X: np.ndarray) -> np.ndarray:

    # Organize input and output
    N, D = X.shape
    X = X.astype(np.float32)
    Y = np.zeros((N, N)).astype(np.float32)

    # Prepare memory pointers
    dataIn = X.ctypes.data_as(cdll.POINTER(cdll.c_float))
    dataOut = Y.ctypes.data_as(cdll.POINTER(cdll.c_float))

    # Call the sqdist dll
    cdll.load(_get_build_default())
    cdll.computeSquaredEuclideanDistances(dataIn, N, D, dataOut)
    cdll.unload()

    # Return as numpy array
    return Y

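Note the conversion to float32 so that numpy's ctypes data_as can be used (CUDA uses 32-bit floats). Now, comparing the output of this method with the output of scipy.spatial.distance.cdist(a, a, metric='sqeuclidean'), I found some strange behavior:

Suppose I have some data Xcl (a numpy array):

Input[1]: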

a = Xcl
b = Xcl + np.zeros(Xcl.shape)
print(a.dtype, type(a), a.shape)
print(b.dtype, type(b), b.shape)
print(np.all(a == b))

Output[1]:

float32 <class 'numpy.ndarray'> (582, 115)
float64 <class 'numpy.ndarray'> (582, 115)
True

Input[2]:

scipydist = scipy.spatial.distance.cdist(a, a, metric='sqeuclidean')
cudadist1 = cuda.sqdist(a)
cudadist2 = cuda.sqdist(b)

plt.figure(figsize=(20, 5))
plt.subplot(131)
plt.imshow(scipydist, vmax=3000)
plt.colorbar()
plt.title("scipydist")
plt.subplot(132)
plt.imshow(cudadist1, vmax=3000)
plt.colorbar()
plt.title("cudadist1")
plt.subplot(133)
plt.imshow(cudadist2, vmax=3000)
plt.colorbar()
plt.title("cudadist2")
plt.show()

Output[2]:

(Figure not preserved: three image panels titled "scipydist", "cudadist1", and "cudadist2".)

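That is, I get different output from the CUDA algorithm depending on whether or not I add zeros to the input. How (the hell) can this happen? What implicitly occurs during numpy addition? The same holds for multiplication with np.ones.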
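The dtype change visible in Output[1] can be reproduced in isolation. The following is a minimal sketch, not taken from the original post; the small array `a` is a stand-in for Xcl. It only shows that np.zeros creates float64 by default, so the sum is promoted to float64, whereas np.zeros_like inherits the dtype of its argument.

```python
import numpy as np

# Stand-in for Xcl (assumption: any float32 array behaves the same way).
a = np.ones((3, 2), dtype=np.float32)

# np.zeros defaults to float64, so float32 + float64 is promoted to float64.
b = a + np.zeros(a.shape)

# np.zeros_like copies the dtype of `a`, so the sum stays float32.
c = a + np.zeros_like(a)

print(a.dtype, b.dtype, c.dtype)
print(np.all(a == b))  # values still compare equal element by element
```

Promotion alone changes the dtype, not the element values, so it does not by itself explain why the CUDA routine produces different results.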

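[Question discussion]:

I suspect this is type promotion; your np.zeros array probably defaults to np.float64.

Yes, np.zeros promotes to float64. Please see the answer I posted below.

So float64 must have a different underlying memory layout?

No, np.float64 arrays can be in any order, but newly created arrays should default to "C" order, AFAIK. I am looking for the relevant documentation, though.

juanpa.arrivillaga is correct for the first example. When a new ndarray is created in Python, it defaults to double precision. Try replacing the second line of your example with: b = Xcl + np.zeros_like(Xcl)

[Answer 1]: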

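OK. This appears to be caused by memory layout. My wrapper uses astype, which defaults to order='K':

'K' means as close to the order the array elements appear in memory as possible

whereas the CUDA application expects the data to be in C order. Updating the wrapper as follows fixes the problem: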

X = X.astype(np.float32, order='C')
Y = np.zeros((N, N)).astype(np.float32, order='C')

So I guess numpy addition implicitly reorders the underlying data into whatever layout suits it?
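To illustrate why the layout matters to code that receives a raw pointer, here is a small sketch. It assumes, hypothetically, that the input is Fortran-ordered; the original post does not state the actual layout of Xcl. The helper `raw_floats` is made up for this example and simply reads the buffer through a ctypes pointer, the way a C or CUDA routine would.

```python
import ctypes
import numpy as np

def raw_floats(arr):
    """Read arr.size float32 values straight from the buffer, ignoring strides."""
    ptr = arr.ctypes.data_as(ctypes.POINTER(ctypes.c_float))
    return [ptr[i] for i in range(arr.size)]

X_c = np.arange(6, dtype=np.float32).reshape(2, 3)  # C (row-major) layout
X_f = np.asfortranarray(X_c)                        # same values, column-major

# Element-wise comparison cannot tell the two layouts apart.
same_values = bool(np.all(X_c == X_f))

raw_c = raw_floats(X_c)
# astype with the default order='K' keeps the column-major layout.
raw_k = raw_floats(X_f.astype(np.float32))
# Forcing order='C' rewrites the buffer in row-major order.
raw_fixed = raw_floats(X_f.astype(np.float32, order='C'))

print(same_values)
print(raw_c)      # [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
print(raw_k)      # [0.0, 3.0, 1.0, 4.0, 2.0, 5.0]
print(raw_fixed)  # [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
```

Under this assumption, a == b can be True while the two buffers handed to the DLL still differ, which is consistent with the fix above.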

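[Discussion]:

What is the order of Xcl?

How do I check the order of a numpy array?

X.flags, X.__array_interface__, and X.strides all give useful information, but they are not always easy to interpret. Look at them before and after astype to see what changes when you force C order.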
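The check suggested in the last comment might look like the sketch below. The Fortran-ordered array `X` is an assumption standing in for Xcl (only its shape and dtype come from the question), and `describe` is a hypothetical helper that collects the contiguity flags and strides.

```python
import numpy as np

def describe(arr):
    """Return (C-contiguous?, F-contiguous?, strides in bytes) for an array."""
    return (bool(arr.flags.c_contiguous),
            bool(arr.flags.f_contiguous),
            arr.strides)

# Assumption: a Fortran-ordered float32 array with the shape from the question.
X = np.asfortranarray(np.zeros((582, 115), dtype=np.float32))

print("original     ", describe(X))
print("astype (K)   ", describe(X.astype(np.float32)))
print("astype (C)   ", describe(X.astype(np.float32, order='C')))
```

For a C-ordered array the last stride equals the item size (4 bytes for float32); for a Fortran-ordered array the first stride does.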
