Is there any way to know which C++ core function is called in TensorFlow?

Posted 2023-02-23

Asked: 2020-11-08 02:06:58

Question:

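I have heard that TensorFlow is wrapped in Python and that its core functionality is implemented in C++. I would like to know which core C++ function ends up being called when I call some Python code. Is there any way to find out? The TensorFlow profiler only gives information about Python functions. Thanks.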

Answer 1:

To get to the C++ code you have to go through three levels: the Python implementation, the wrapper, and the C++ code itself.

If it is an Op (such as Conv / Matmul / ...)

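First, you need to trace what the Python implementation calls. That can be hard if you use a high-level library such as Keras. It is easier if you call the math operations directly in TF (for example nn.conv2d). Most ops are implemented in tensorflow/python/ops. For example, the function nn.conv2d is implemented in tensorflow/python/ops/nn_ops.py.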
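The first step, finding the file that holds the Python implementation, can be sketched with the standard `inspect` module. This is only an illustration: the helper name `where_defined` is made up here, and it is demonstrated on `json.dumps` so that it runs without TensorFlow installed. Passing `tf.nn.conv2d` instead should point into the TensorFlow package, but TensorFlow's decorators may hide the original function, so treat the result as a hint rather than a guarantee.

```python
import inspect
import json


def where_defined(func):
    """Return (source_file, first_line) for a Python callable.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    # Follow the __wrapped__ chain that well-behaved decorators leave behind.
    target = inspect.unwrap(func)
    source_file = inspect.getsourcefile(target)
    _, first_line = inspect.getsourcelines(target)
    return source_file, first_line


# Demonstrated on a standard-library function so the sketch is self-contained.
source_file, first_line = where_defined(json.dumps)
print(source_file, first_line)
```

Built-in functions implemented in C have no Python source, so `inspect` raises `TypeError` for them; that is usually the sign that you have reached the wrapper layer.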

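As you can see, this function (like most ops) delegates the work to gen_nn_ops.conv2d. The gen_ files are generated automatically during the build, so they are not in the source tree; unless you are willing to go through the Bazel files and build from source, you cannot view this file there.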

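Fortunately, as far as I can tell, there is a direct mapping between the functions available in the gen_ files and the ops defined in the .cc files.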

By looking into tensorflow/core/ops/nn_ops.cc you can find the registration of the Conv op:

REGISTER_OP("Conv2D")
    .Input("input: T")
    .Input("filter: T")
    .Output("output: T")
    .Attr("T: half, bfloat16, float, double")
    .Attr("strides: list(int)")
    .Attr("use_cudnn_on_gpu: bool = true")
    .Attr(GetPaddingAttrStringWithExplicit())
    .Attr(GetExplicitPaddingsAttrString())
    .Attr(GetConvnetDataFormatAttrString())
    .Attr("dilations: list(int) = [1, 1, 1, 1]")
    .SetShapeFn(shape_inference::Conv2DShapeWithExplicitPadding);

Unfortunately, this macro only tells TensorFlow that an op called Conv2D exists; it says nothing about how the op should be run.
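If you have a source checkout, registrations like the one above can be located mechanically. The following is a rough sketch, not a real parser: the helper `find_registered_ops` and the second sample op `ExampleOp` are invented for illustration, and the regular expression assumes that no attribute string contains a semicolon.

```python
import re

# Sample text shaped like the registration quoted above (abbreviated).
# "ExampleOp" is a made-up op used only to show that several are found.
SAMPLE_CC = '''
REGISTER_OP("Conv2D")
    .Input("input: T")
    .Input("filter: T")
    .Output("output: T")
    .Attr("strides: list(int)");

REGISTER_OP("ExampleOp")
    .Input("x: T")
    .Output("y: T");
'''

# One registration runs from REGISTER_OP("Name") to the next semicolon.
OP_RE = re.compile(r'REGISTER_OP\("(\w+)"\)(.*?);', re.DOTALL)
# Inside a registration, pick up the argument names of .Input / .Output.
IO_RE = re.compile(r'\.(Input|Output)\("(\w+):')


def find_registered_ops(cc_text):
    """Map op name -> its declared input and output argument names."""
    ops = {}
    for name, body in OP_RE.findall(cc_text):
        pairs = IO_RE.findall(body)
        ops[name] = {
            "inputs": [arg for kind, arg in pairs if kind == "Input"],
            "outputs": [arg for kind, arg in pairs if kind == "Output"],
        }
    return ops


ops = find_registered_ops(SAMPLE_CC)
for op_name, signature in ops.items():
    print(op_name, signature)
```

To run it on real sources, read a file such as tensorflow/core/ops/nn_ops.cc into a string and pass that string in place of `SAMPLE_CC`.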

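In TensorFlow, an Op specifies what needs to be done, but a Kernel is what actually does the work. You can find the kernels that can run a given Op by searching for the REGISTER_KERNEL_BUILDER macro, which is responsible for matching a kernel to an Op. For Conv2D, you can find one in tensorflow/core/kernels/conv_ops.cc: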


#define REGISTER_CPU(T)                                         \
  REGISTER_KERNEL_BUILDER(                                      \
      Name("Conv2D").Device(DEVICE_CPU).TypeConstraint<T>("T"), \
      Conv2DOp<CPUDevice, T>);

// If we're using the alternative GEMM-based implementation of Conv2D for the
// CPU implementation, don't register this EigenTensor-based version.
#if !defined(USE_GEMM_FOR_CONV)
TF_CALL_half(REGISTER_CPU);
TF_CALL_float(REGISTER_CPU);
TF_CALL_double(REGISTER_CPU);
#endif  // USE_GEMM_FOR_CONV
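Searching for REGISTER_KERNEL_BUILDER can also be scripted. The sketch below pulls the op name, the device, and the kernel class out of text shaped like the macro above. The helper `find_kernels` is hypothetical, and the pattern is tuned to this one layout (a `Name(...)` chain, a comma, then the kernel class), so expect it to miss registrations written differently.

```python
import re

# The macro quoted above, kept verbatim (raw string preserves backslashes).
SAMPLE_CC = r'''
#define REGISTER_CPU(T)                                         \
  REGISTER_KERNEL_BUILDER(                                      \
      Name("Conv2D").Device(DEVICE_CPU).TypeConstraint<T>("T"), \
      Conv2DOp<CPUDevice, T>);
'''

# Groups: op name, the builder chain after Name(...), the kernel class.
KERNEL_RE = re.compile(
    r'REGISTER_KERNEL_BUILDER\(\s*Name\("(\w+)"\)(.*?),\s*'
    r'([\w:]+(?:<[^;]*?>)?)\s*\)\s*;',
    re.DOTALL,
)
DEVICE_RE = re.compile(r'\.Device\((\w+)\)')


def find_kernels(cc_text):
    """Return one dict per kernel registration found in cc_text."""
    # Drop preprocessor line continuations so a macro body reads as plain code.
    text = re.sub(r'\\[ \t]*\n', '\n', cc_text)
    found = []
    for op, chain, kernel in KERNEL_RE.findall(text):
        device = DEVICE_RE.search(chain)
        found.append({
            "op": op,
            "device": device.group(1) if device else None,
            "kernel": kernel,
        })
    return found


kernels = find_kernels(SAMPLE_CC)
for entry in kernels:
    print(entry)
```

The kernel class reported here (`Conv2DOp<CPUDevice, T>`) is the one whose `Compute` method you would read next.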

This finally brings us to what we are looking for. A kernel has a Compute method, so we are interested in Conv2DOp::Compute. Here it is (defined in the same file):

  void Compute(OpKernelContext* context) override {
    // Input tensor is of the following dimensions:
    // [ batch, in_rows, in_cols, in_depth ]
    const Tensor& input = context->input(0);

    // Input filter is of the following dimensions:
    // [ filter_rows, filter_cols, in_depth, out_depth]
    const Tensor& filter = context->input(1);

    Conv2DDimensions dimensions;
    OP_REQUIRES_OK(context,
                   ComputeConv2DDimension(params_, input, filter, &dimensions));

    TensorShape out_shape = ShapeFromFormat(
        params_.data_format, dimensions.batch, dimensions.out_rows,
        dimensions.out_cols, dimensions.out_depth);

    // Output tensor is of the following dimensions:
    // [ in_batch, out_rows, out_cols, out_depth ]
    Tensor* output = nullptr;
    OP_REQUIRES_OK(context, context->allocate_output(0, out_shape, &output));
    
    ...  // Skipped for clarity

    if (params_.padding != EXPLICIT &&
        LaunchDeepConvOp<Device, T>::Run(
            context, input, filter, dimensions.batch, dimensions.input_rows,
            dimensions.input_cols, dimensions.in_depth, dimensions.filter_rows,
            dimensions.filter_cols, dimensions.pad_rows_before,
            dimensions.pad_cols_before, dimensions.out_rows,
            dimensions.out_cols, dimensions.out_depth, dimensions.dilation_rows,
            dimensions.dilation_cols, dimensions.stride_rows,
            dimensions.stride_cols, output, params_.data_format)) {
      return;
    }
    ...

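This is the end of the journey. Some ops have their actual implementation right here. Conv2D is less satisfying: it turns out that it delegates the work to LaunchDeepConvOp. You can dig deeper if you want.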

If it is not an Op

Ops are quite special in TF. Other code is linked to Python through the C API.

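The C API is available as c_api.cc and c_api.h. The header file declares the C functions that are available to Python. The source file (.cc) is the bridge between C and C++: it defines the C functions (or, more precisely, functions with C linkage) that call the corresponding C++ functions. If you know the C function, it is easy to trace which C++ function is called.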

From Python, it usually looks like this:

# Import
from tensorflow.python import pywrap_tensorflow as c_api

...

# Usage
def get_all_registered_kernels():
  """Returns a KernelList proto of all registered kernels.
  """
  buf = c_api.TF_GetAllRegisteredKernels()

As you can see, the names match. The implementation of this wrapper is generated, so there is no need to look for it.
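The reason the names match is that a function with C linkage is exported under its plain, unmangled name, so any caller can look it up by that name. The sketch below shows the idea with the standard `ctypes` module and the C library's `strlen`. It is only an analogy: TensorFlow generates its wrapper and does not load the C API this way, and the example assumes a POSIX system where the C library can be found.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. find_library may return None, in which case
# CDLL(None) falls back to the symbols already loaded into the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# The symbol is looked up by its plain C name, just as the Python wrapper
# above refers to TF_GetAllRegisteredKernels by its C name.
c_strlen = libc.strlen
c_strlen.argtypes = [ctypes.c_char_p]
c_strlen.restype = ctypes.c_size_t

length = c_strlen(b"Conv2D")
print(length)
```

A C++ function without C linkage would be exported under a mangled name and could not be found this simply, which is why the bridge in c_api.cc exists.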
