Extracting unsigned char from array of numpy.uint8

Posted 2023-02-23

[Title]: Extracting unsigned char from array of numpy.uint8
[Posted]: 2014-10-27 19:27:36
[Question]:

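I have code that extracts numeric values from a Python sequence. It works in most cases, but not for a numpy array.

When I try to extract an unsigned char, I do the following:

unsigned char val = boost::python::extract<unsigned char>(sequence[n]);

where sequence is any Python sequence and n is the index. I get the following error:

TypeError: No registered converter was able to produce a C++ rvalue of type 
unsigned char from this Python object of type numpy.uint8

How can I successfully extract an unsigned char in C++? Do I have to write/register special converters for numpy types? I would rather use the same code that I use for other Python sequences, and not have to write special code that uses PyArrayObject*.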

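[Comments]:

Numpy uses native C types, so your goal is not to convert the value but to use it directly (for example, by finding its memory location).

sequence is a boost::python::object. Should I use static_cast instead? Like unsigned char val = static_cast<unsigned char>(sequence[n]);

[Answer 1]: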

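One can register a custom from-python converter with Boost.Python that handles conversions from NumPy array scalars, such as numpy.uint8, to C++ scalars, such as unsigned char. A custom from-python converter registration has three parts:

1. A function that checks whether a PyObject is convertible. A return value of NULL indicates that the PyObject cannot use the registered converter.
2. A construct function that constructs the C++ type from a PyObject. This function is only called if the convertible check did not return NULL.
3. The C++ type that will be constructed.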
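To make the two-stage shape concrete, here is a minimal sketch that imitates it without Boost.Python or Python at all. Everything in it (FakeObject, Converter, uchar_registry, extract_uchar and the uint8_* functions) is a hypothetical stand-in invented for illustration and is not part of the Boost.Python API; it only shows that construct is reached after convertible has accepted the object.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <vector>

// Hypothetical stand-in for PyObject: a type name plus a raw payload.
struct FakeObject {
    std::string type_name;
    long payload;
};

// Part 1: returns non-null when this converter accepts the object.
typedef void* (*convertible_fn)(FakeObject*);
// Part 2: builds the C++ value in storage supplied by the caller.
typedef void (*construct_fn)(FakeObject*, void* storage);

struct Converter {
    convertible_fn convertible;
    construct_fn construct;
};

// Part 3 is the target type; this toy registry serves unsigned char only.
std::vector<Converter>& uchar_registry() {
    static std::vector<Converter> registry;
    return registry;
}

void* uint8_convertible(FakeObject* object) {
    return (object && object->type_name == "uint8") ? object : nullptr;
}

void uint8_construct(FakeObject* object, void* storage) {
    assert(object != nullptr && storage != nullptr);
    new (storage) unsigned char(static_cast<unsigned char>(object->payload));
}

void register_uint8_converter() {
    Converter converter = { &uint8_convertible, &uint8_construct };
    uchar_registry().push_back(converter);
}

// Asks each converter in turn; construct runs only after convertible accepts.
bool extract_uchar(FakeObject* object, unsigned char& out) {
    std::vector<Converter>& registry = uchar_registry();
    for (std::size_t i = 0; i < registry.size(); ++i) {
        if (registry[i].convertible(object) != nullptr) {
            unsigned char storage = 0;
            registry[i].construct(object, &storage);
            out = storage;
            return true;
        }
    }
    return false;
}
```

After calling register_uint8_converter(), extract_uchar() succeeds for an object tagged "uint8" and returns false for any other tag, which mirrors the "No registered converter" failure in the question.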

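Extracting a value from a NumPy array scalar requires a few NumPy C API calls:

- import_array() must be called within the initialization of an extension module that is going to use the NumPy C API. Depending on how the extension uses the NumPy C API, other import requirements may need to be met.
- PyArray_CheckScalar() checks whether a PyObject is a NumPy array scalar.
- PyArray_DescrFromScalar() gets the data-type-descriptor object for an array scalar. The data-type-descriptor object contains information about how to interpret the underlying bytes. For example, its type_num data member contains an enum value that corresponds to a C type.
- PyArray_ScalarAsCtype() can be used to extract the C-type value from a NumPy array scalar.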
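Since type_num is an enum value that corresponds to a C type, one possible refinement is a small trait that maps a C++ type to its expected enum value, so a converter could look the value up instead of taking it as a second template argument. The sketch below is only an illustration under that assumption: FakeNpyType and FakeDescr are hypothetical stand-ins for NumPy's NPY_TYPES and descriptor object, and type_num_of and descriptor_matches are made-up names.

```cpp
// Hypothetical mirror of a few NPY_TYPES values. The real enum comes from
// the NumPy headers; these names exist only for this sketch.
enum FakeNpyType { FAKE_NPY_BYTE, FAKE_NPY_UBYTE, FAKE_NPY_INT, FAKE_NPY_DOUBLE };

// Hypothetical stand-in for the data-type-descriptor object.
struct FakeDescr {
    FakeNpyType type_num;
};

// Primary template is left undefined so unsupported types fail to compile.
template <typename T> struct type_num_of;

template <> struct type_num_of<signed char>   { static constexpr FakeNpyType value = FAKE_NPY_BYTE; };
template <> struct type_num_of<unsigned char> { static constexpr FakeNpyType value = FAKE_NPY_UBYTE; };
template <> struct type_num_of<int>           { static constexpr FakeNpyType value = FAKE_NPY_INT; };
template <> struct type_num_of<double>        { static constexpr FakeNpyType value = FAKE_NPY_DOUBLE; };

// True when the descriptor's type_num is the one expected for T.
template <typename T>
bool descriptor_matches(const FakeDescr& descr) {
    return descr.type_num == type_num_of<T>::value;
}
```

With a descriptor whose type_num is FAKE_NPY_UBYTE, descriptor_matches<unsigned char>() is true and descriptor_matches<int>() is false, which is the same exact-match test a convertible check would perform.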

Here is a complete example demonstrating the use of a helper class, enable_numpy_scalar_converter, to register specific NumPy array scalars to their corresponding C++ types.

#include <boost/cstdint.hpp>
#include <boost/python.hpp>
#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
#include <numpy/arrayobject.h>

// Mockup functions.

/// @brief Mockup function that will explicitly extract a uint8_t
///        from the Boost.Python object.
boost::uint8_t test_generic_uint8(boost::python::object object)
{
  return boost::python::extract<boost::uint8_t>(object)();
}

/// @brief Mockup function that uses automatic conversions for uint8_t.
boost::uint8_t test_specific_uint8(boost::uint8_t value) { return value; }

/// @brief Mockup function that uses automatic conversions for int32_t.
boost::int32_t test_specific_int32(boost::int32_t value) { return value; }


/// @brief Converter type that enables automatic conversions between NumPy
///        scalars and C++ types.
template <typename T, NPY_TYPES NumPyScalarType>
struct enable_numpy_scalar_converter
{
  enable_numpy_scalar_converter()
  {
    // Required NumPy call in order to use the NumPy C API within another
    // extension module.
    // Note: import_array() is a macro that contains a return statement. On
    // Python 3 it returns a value, so it may need to move into a helper
    // function with a matching return type.
    import_array();

    boost::python::converter::registry::push_back(
      &convertible,
      &construct,
      boost::python::type_id<T>());
  }

  static void* convertible(PyObject* object)
  {
    // The object is convertible if all of the following are true:
    // - is a valid object.
    // - is a numpy array scalar.
    // - its descriptor type matches the type for this converter.
    if (!object || !PyArray_CheckScalar(object))
      return NULL;

    // PyArray_DescrFromScalar() returns a new reference, so release it
    // once the type number has been compared.
    PyArray_Descr* descr = PyArray_DescrFromScalar(object);
    const bool match = (descr->type_num == NumPyScalarType);
    Py_DECREF(descr);

    return match
      ? object // The Python object can be converted.
      : NULL;
  }

  static void construct(
    PyObject* object,
    boost::python::converter::rvalue_from_python_stage1_data* data)
  {
    // Obtain a handle to the memory block that the converter has allocated
    // for the C++ type.
    namespace python = boost::python;
    typedef python::converter::rvalue_from_python_storage<T> storage_type;
    void* storage = reinterpret_cast<storage_type*>(data)->storage.bytes;

    // Extract the array scalar type directly into the storage.
    PyArray_ScalarAsCtype(object, storage);

    // Set convertible to indicate success. 
    data->convertible = storage;
  }
};

BOOST_PYTHON_MODULE(example)
{
  namespace python = boost::python;

  // Enable numpy scalar conversions.
  enable_numpy_scalar_converter<boost::uint8_t, NPY_UBYTE>();
  enable_numpy_scalar_converter<boost::int32_t, NPY_INT>();

  // Expose test functions.
  python::def("test_generic_uint8",  &test_generic_uint8);
  python::def("test_specific_uint8", &test_specific_uint8);
  python::def("test_specific_int32", &test_specific_int32);
}

Interactive usage:

>>> import numpy
>>> import example
>>> assert(42 == example.test_generic_uint8(42))
>>> assert(42 == example.test_generic_uint8(numpy.uint8(42)))
>>> assert(42 == example.test_specific_uint8(42))
>>> assert(42 == example.test_specific_uint8(numpy.uint8(42)))
>>> assert(42 == example.test_specific_int32(numpy.int32(42)))
>>> example.test_specific_int32(numpy.int8(42))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Boost.Python.ArgumentError: Python argument types in
    example.test_specific_int32(numpy.int8)
did not match C++ signature:
    test_specific_int32(int)
>>> example.test_generic_uint8(numpy.int8(42))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: No registered converter was able to produce a C++ rvalue of type
  unsigned char from this Python object of type numpy.int8

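A few things to note about the interactive usage:

- Boost.Python was able to extract boost::uint8_t from both numpy.uint8 and int Python objects.
- enable_numpy_scalar_converter does not support promotions. For example, it should be safe for test_specific_int32() to accept a numpy.int8 object that is promoted to a larger scalar type, such as int. If one wishes to perform promotions:
  - convertible() will need to check for compatible NPY_TYPES.
  - construct() should use PyArray_CastScalarToCtype() to cast the extracted array scalar value to the desired C++ type.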
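As a rough illustration of what "compatible" could mean in a promotion-aware convertible(), the sketch below decides at compile time whether every value of one arithmetic type is representable in another, using only std::numeric_limits. The name is_safe_promotion is hypothetical and not part of Boost.Python or NumPy, and the check assumes binary (radix 2) floating-point types. A real converter would still have to map NPY_TYPES values onto C++ types before applying such a rule.

```cpp
#include <limits>

// True when every value of From can be represented exactly in To.
// Hypothetical helper for illustration only.
template <typename From, typename To>
constexpr bool is_safe_promotion() {
    typedef std::numeric_limits<From> F;
    typedef std::numeric_limits<To> T;
    return (F::is_integer && T::is_integer)
               // Integer to integer: a signed source needs a signed target,
               // and the target needs at least as many value bits.
               ? ((!F::is_signed || T::is_signed) && T::digits >= F::digits)
           : F::is_integer
               // Integer to floating point: the mantissa must hold all bits.
               ? (T::digits >= F::digits)
           : !T::is_integer
               // Floating point to floating point: mantissa and exponent
               // range must both be at least as large.
               ? (T::digits >= F::digits && T::max_exponent >= F::max_exponent)
               // Floating point to integer always risks losing information.
               : false;
}
```

Under this rule int8 to int and uint8 to int are accepted, while int to int8, int to unsigned int, and double to float are rejected.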

[Answer 2]:

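Here is a slightly more generic version of the accepted answer: https://github.com/stuarteberg/printnum

(The converter is copied from the VIGRA C++/Python bindings.)

The accepted answer points out that it does not support casting between scalar types. This converter allows implicit conversion between any two scalar types (even, say, int32 to int8, or float32 to uint8). I think that is generally nicer, but there is a slight convenience/safety trade-off here.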

#include <stdint.h>  // uint32_t and int16_t, used by print_number()
#include <iostream>
#include <boost/python.hpp>

#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION

// http://docs.scipy.org/doc/numpy/reference/c-api.array.html#importing-the-api
#define PY_ARRAY_UNIQUE_SYMBOL printnum_cpp_module_PyArray_API
#include <numpy/arrayobject.h>
#include <numpy/arrayscalars.h>


/*
 * Boost python converter for numpy scalars, e.g. numpy.uint32(123).
 * Enables automatic conversion from numpy.intXX, floatXX
 * in python to C++ char, short, int, float, etc.
 * When casting from float to int (or wide int to narrow int),
 * normal C++ casting rules apply.
 *
 * Like all boost::python converters, this enables automatic conversion for function args
 * exposed via boost::python::def(), as well as values converted via boost::python::extract<>().
 *
 * Copied from the VIGRA C++ library source code (MIT license).
 * http://ukoethe.github.io/vigra
 * https://github.com/ukoethe/vigra
 */
template <typename ScalarType>
struct NumpyScalarConverter
{
    NumpyScalarConverter()
    {
        using namespace boost::python;
        converter::registry::push_back( &convertible, &construct, type_id<ScalarType>());
    }

    // Determine if obj_ptr is a supported numpy.number
    static void* convertible(PyObject* obj_ptr)
    {
        if (PyArray_IsScalar(obj_ptr, Float32) ||
            PyArray_IsScalar(obj_ptr, Float64) ||
            PyArray_IsScalar(obj_ptr, Int8)    ||
            PyArray_IsScalar(obj_ptr, Int16)   ||
            PyArray_IsScalar(obj_ptr, Int32)   ||
            PyArray_IsScalar(obj_ptr, Int64)   ||
            PyArray_IsScalar(obj_ptr, UInt8)   ||
            PyArray_IsScalar(obj_ptr, UInt16)  ||
            PyArray_IsScalar(obj_ptr, UInt32)  ||
            PyArray_IsScalar(obj_ptr, UInt64))
        {
            return obj_ptr;
        }
        return 0;
    }

    static void construct( PyObject* obj_ptr, boost::python::converter::rvalue_from_python_stage1_data* data)
    {
        using namespace boost::python;

        // Grab pointer to memory into which to construct the C++ scalar
        void* storage = ((converter::rvalue_from_python_storage<ScalarType>*) data)->storage.bytes;

        // in-place construct the new scalar value
        ScalarType * scalar = new (storage) ScalarType;

        if (PyArray_IsScalar(obj_ptr, Float32))
            (*scalar) = PyArrayScalar_VAL(obj_ptr, Float32);
        else if (PyArray_IsScalar(obj_ptr, Float64))
            (*scalar) = PyArrayScalar_VAL(obj_ptr, Float64);
        else if (PyArray_IsScalar(obj_ptr, Int8))
            (*scalar) = PyArrayScalar_VAL(obj_ptr, Int8);
        else if (PyArray_IsScalar(obj_ptr, Int16))
            (*scalar) = PyArrayScalar_VAL(obj_ptr, Int16);
        else if (PyArray_IsScalar(obj_ptr, Int32))
            (*scalar) = PyArrayScalar_VAL(obj_ptr, Int32);
        else if (PyArray_IsScalar(obj_ptr, Int64))
            (*scalar) = PyArrayScalar_VAL(obj_ptr, Int64);
        else if (PyArray_IsScalar(obj_ptr, UInt8))
            (*scalar) = PyArrayScalar_VAL(obj_ptr, UInt8);
        else if (PyArray_IsScalar(obj_ptr, UInt16))
            (*scalar) = PyArrayScalar_VAL(obj_ptr, UInt16);
        else if (PyArray_IsScalar(obj_ptr, UInt32))
            (*scalar) = PyArrayScalar_VAL(obj_ptr, UInt32);
        else if (PyArray_IsScalar(obj_ptr, UInt64))
            (*scalar) = PyArrayScalar_VAL(obj_ptr, UInt64);

        // Stash the memory chunk pointer for later use by boost.python
        data->convertible = storage;
    }
};

/*
 * A silly function to test scalar conversion.
 * The first arg tests automatic function argument conversion.
 * The second arg is used to demonstrate explicit conversion via boost::python::extract<>()
 */
void print_number( uint32_t number, boost::python::object other_number )
{
    using namespace boost::python;
    std::cout << "The number is: " << number << std::endl;
    std::cout << "The other number is: " << extract<int16_t>(other_number) << std::endl;
}

/*
 * Instantiate the python extension module 'printnum'.
 *
 * Example Python usage:
 *
 *     import numpy as np
 *     from printnum import print_number
 *     print_number( np.uint8(123), np.int64(-456) )
 *
 *     ## That prints the following:
 *     # The number is: 123
 *     # The other number is: -456
 */
BOOST_PYTHON_MODULE(printnum)
{
    using namespace boost::python;

    // http://docs.scipy.org/doc/numpy/reference/c-api.array.html#importing-the-api
    import_array();

    // Register conversion for all scalar types.
    NumpyScalarConverter<signed char>();
    NumpyScalarConverter<short>();
    NumpyScalarConverter<int>();
    NumpyScalarConverter<long>();
    NumpyScalarConverter<long long>();
    NumpyScalarConverter<unsigned char>();
    NumpyScalarConverter<unsigned short>();
    NumpyScalarConverter<unsigned int>();
    NumpyScalarConverter<unsigned long>();
    NumpyScalarConverter<unsigned long long>();
    NumpyScalarConverter<float>();
    NumpyScalarConverter<double>();

    // Expose our C++ function as a python function.
    def("print_number", &print_number, (arg("number"), arg("other_number")));
}
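The convenience/safety trade-off mentioned in this answer comes from plain C++ narrowing: assigning a wide integer to a narrow one silently discards the high bits, and converting a floating-point value that is out of range for the integer target is undefined behavior. The sketch below shows, for integer sources only, what a range check before the cast might look like. The names unchecked_to_uint8, fits_in and narrow_or_throw are hypothetical and do not come from VIGRA, Boost.Python or NumPy.

```cpp
#include <limits>
#include <stdexcept>

// Unchecked narrowing, as a plain assignment or static_cast would do.
unsigned char unchecked_to_uint8(long long value) {
    // Reduced modulo 256 when unsigned char has 8 bits.
    return static_cast<unsigned char>(value);
}

// True when value lies inside the range of the integer type To.
template <typename To>
bool fits_in(long long value) {
    typedef std::numeric_limits<To> L;
    if (!L::is_signed) {
        // Negative values never fit; compare the rest as unsigned.
        return value >= 0 &&
               static_cast<unsigned long long>(value) <=
                   static_cast<unsigned long long>(L::max());
    }
    return value >= static_cast<long long>(L::min()) &&
           value <= static_cast<long long>(L::max());
}

// Narrowing that refuses to lose information.
template <typename To>
To narrow_or_throw(long long value) {
    if (!fits_in<To>(value)) {
        throw std::range_error("value does not fit in target type");
    }
    return static_cast<To>(value);
}
```

For example, unchecked_to_uint8(300) yields 44 on platforms with 8-bit chars, whereas narrow_or_throw<unsigned char>(300) throws std::range_error. A converter that wanted the safer behavior could apply a check of this kind inside construct(), at the cost of rejecting values that the permissive version would wrap.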
