How to convert a compiled protocol buffer back to a .proto file?

Posted 2023-02-16

【Title】: How to convert a compiled protocol buffer back to .proto file? 【Asked】: 2017-12-06 07:39:10 【Question】:

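I have a Google protocol buffer compiled for Python 2, and I am trying to port it to Python 3. Unfortunately, I cannot find the proto file that was used to generate the compiled protocol buffer anywhere. How do I recover the proto file so that I can compile a new one for Python 3? I don't know which proto version was used; all I have is the .py file, which was meant to run on Python 2.6.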

【Comments】:

At least post the contents of the Python file.

@TarunLalwani Unfortunately, the file contents are confidential, so I can't post them.

【Answer 1】:

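You will have to write code (in Python, for example) that walks the tree of your message descriptors. In principle, they carry the full information of the original proto file, with the exception of the code comments. The generated Python module you still have should let you serialize the file descriptor of your proto file as a file descriptor proto message, which can then be fed to code that expresses it as proto source.

As a guide, look at the various code generators for protoc, which do essentially the same thing: they read a file descriptor as a protobuf message, analyze it, and generate code.

Here is a basic introduction to writing a Protobuf plugin in Python: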

https://www.expobrain.net/2015/09/13/create-a-plugin-for-google-protocol-buffer/

Here is the official list of protoc plugins:

https://github.com/google/protobuf/blob/master/docs/third_party.md

And here is a protoc plugin, written in Python, that generates Lua code:

https://github.com/sean-lin/protoc-gen-lua/blob/master/plugin/protoc-gen-lua

Let's look at its main code block:

def main():
    plugin_require_bin = sys.stdin.read()
    code_gen_req = plugin_pb2.CodeGeneratorRequest()
    code_gen_req.ParseFromString(plugin_require_bin)

    env = Env()
    for proto_file in code_gen_req.proto_file:
        code_gen_file(proto_file, env,
                      proto_file.name in code_gen_req.file_to_generate)

    code_generated = plugin_pb2.CodeGeneratorResponse()
    for k in  _files:
        file_desc = code_generated.file.add()
        file_desc.name = k
        file_desc.content = _files[k]

    sys.stdout.write(code_generated.SerializeToString())

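The loop for proto_file in code_gen_req.proto_file: iterates over the file descriptor objects that protoc hands to the code generator plugin (the files it wants Lua code for, plus the files they import). So now you could do something like this: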

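from google.protobuf import descriptor_pb2

# Any generated message class leads to the file descriptor of your proto file
file_descr = your_package_pb2.sometype.DESCRIPTOR.file
# serialized version of file descriptor (FileDescriptorProto bytes)
filedescr_msg = file_descr.serialized_pb
# parse the bytes into the FileDescriptorProto message a plugin works on
filedescr = descriptor_pb2.FileDescriptorProto()
filedescr.ParseFromString(filedescr_msg)
# required by lua codegen
env = Env()
# create LUA code -> modify it to create proto code
code_gen_file(filedescr, env, "your_package.proto")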
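
Independently of the Lua plugin, the recovery step itself can be sketched roughly as follows. This is only a sketch under assumptions: timestamp_pb2 (which ships with the protobuf package) stands in for your own generated module, the helper names recover_file_proto and outline are made up for illustration, and it assumes the generated module exposes a module-level DESCRIPTOR with serialized_pb, as generated Python code normally does. It has to run in an interpreter that can still import the old module.

```python
from google.protobuf import descriptor_pb2, timestamp_pb2


def recover_file_proto(pb2_module):
    """Return the FileDescriptorProto embedded in a generated *_pb2 module."""
    file_proto = descriptor_pb2.FileDescriptorProto()
    # DESCRIPTOR is the module-level FileDescriptor; serialized_pb holds the
    # bytes of the FileDescriptorProto that protoc embedded at generation time.
    file_proto.ParseFromString(pb2_module.DESCRIPTOR.serialized_pb)
    return file_proto


def outline(file_proto):
    """List 'Message.field = number' entries for top-level messages only."""
    lines = []
    for message in file_proto.message_type:
        for field in message.field:
            lines.append("%s.%s = %d" % (message.name, field.name, field.number))
    return lines


# timestamp_pb2 is a stand-in; pass your own generated module instead.
file_proto = recover_file_proto(timestamp_pb2)
print(file_proto.name)
print("\n".join(outline(file_proto)))
```

The resulting file_proto also carries the package, imports (dependency), enums, nested types and options, so it is the natural input for whatever code writes the .proto text.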

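【Comments】:

Can you clarify what you mean by sometype?

sometype means that you can use any message type from your proto definition. You only need this type to reach the FileDescriptor object of your proto file. There is probably a more direct way to get a FileDescriptor instance for your proto file, but I don't know it.

【Answer 2】: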

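As mentioned in the other answer, you need to walk the tree of descriptor messages and build the contents of your proto file from it.

You can find a complete C++ example in the protocol buffers GitHub repository. Here are some C++ code snippets from it, to give you an idea of how to implement this in Python: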
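
A rough Python sketch of such a walk is shown below. It only emits the block structure (messages, nested messages, enums) and leaves fields out. The function names are hypothetical, and the input objects are hand-built stand-ins that merely use the same attribute names as DescriptorProto and EnumDescriptorProto (name, nested_type, enum_type, value, number), so the sketch can be tried without a generated module.

```python
from types import SimpleNamespace as NS


def render_enum(enum, indent=""):
    """Emit an enum block with one 'NAME = number;' line per value."""
    lines = [indent + "enum %s {" % enum.name]
    for value in enum.value:
        lines.append(indent + "  %s = %d;" % (value.name, value.number))
    lines.append(indent + "}")
    return lines


def render_message(message, indent=""):
    """Recursively emit the block structure of one message (fields omitted)."""
    lines = [indent + "message %s {" % message.name]
    for enum in message.enum_type:
        lines.extend(render_enum(enum, indent + "  "))
    for nested in message.nested_type:
        lines.extend(render_message(nested, indent + "  "))
    lines.append(indent + "}")
    return lines


# Hand-built stand-ins with the same attribute names as the descriptor protos.
color = NS(name="Color", value=[NS(name="RED", number=0), NS(name="BLUE", number=1)])
inner = NS(name="Inner", enum_type=[], nested_type=[])
outer = NS(name="Outer", enum_type=[color], nested_type=[inner])
print("\n".join(render_message(outer)))
```

With real descriptors, map fields appear as synthetic nested entry messages, which a complete converter would fold back into map<...> fields rather than print as nested messages.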

  // Special case map fields.
  if (is_map()) {
    strings::SubstituteAndAppend(
        &field_type, "map<$0, $1>",
        message_type()->field(0)->FieldTypeNameDebugString(),
        message_type()->field(1)->FieldTypeNameDebugString());
  } else {
    field_type = FieldTypeNameDebugString();
  }

  std::string label = StrCat(kLabelToName[this->label()], " ");

  // Label is omitted for maps, oneof, and plain proto3 fields.
  if (is_map() || containing_oneof() ||
      (is_optional() && !has_optional_keyword())) {
    label.clear();
  }

  SourceLocationCommentPrinter comment_printer(this, prefix,
                                               debug_string_options);
  comment_printer.AddPreComment(contents);

  strings::SubstituteAndAppend(
      contents, "$0$1$2 $3 = $4", prefix, label, field_type,
      type() == TYPE_GROUP ? message_type()->name() : name(), number());

where the FieldTypeNameDebugString function looks like this:

// The field type string used in FieldDescriptor::DebugString()
std::string FieldDescriptor::FieldTypeNameDebugString() const {
  switch (type()) {
    case TYPE_MESSAGE:
      return "." + message_type()->full_name();
    case TYPE_ENUM:
      return "." + enum_type()->full_name();
    default:
      return kTypeToName[type()];
  }
}
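
A possible Python counterpart of the two C++ excerpts might look like the sketch below. The function names are hypothetical; the numeric keys are the Type and Label enum values from descriptor.proto. The inputs are hand-built stand-ins using the attribute names of FieldDescriptorProto (name, number, label, type, type_name). Unlike the C++ code, this sketch appends the trailing semicolon directly and leaves out the group and map<...> special cases.

```python
from types import SimpleNamespace

# Names keyed by the FieldDescriptorProto.Type enum values.
TYPE_NAMES = {
    1: "double", 2: "float", 3: "int64", 4: "uint64", 5: "int32",
    6: "fixed64", 7: "fixed32", 8: "bool", 9: "string", 10: "group",
    12: "bytes", 13: "uint32", 15: "sfixed32", 16: "sfixed64",
    17: "sint32", 18: "sint64",
}
TYPE_MESSAGE, TYPE_ENUM = 11, 14
# Names keyed by the FieldDescriptorProto.Label enum values.
LABEL_NAMES = {1: "optional", 2: "required", 3: "repeated"}


def field_type_name(field):
    """Python counterpart of FieldTypeNameDebugString for a descriptor proto."""
    if field.type in (TYPE_MESSAGE, TYPE_ENUM):
        # Assumption: type_name is already fully qualified (".pkg.Name"),
        # which is how protoc normally fills it in.
        return field.type_name
    return TYPE_NAMES[field.type]


def field_declaration(field, prefix="  ", omit_label=False):
    """Render one field line; pass omit_label=True for map, oneof and
    plain proto3 fields, as in the label.clear() branch above."""
    label = "" if omit_label else LABEL_NAMES[field.label] + " "
    return "%s%s%s %s = %d;" % (prefix, label, field_type_name(field),
                                field.name, field.number)


# Hand-built stand-ins with the attribute names of FieldDescriptorProto.
user_id = SimpleNamespace(name="id", number=1, label=1, type=5, type_name="")
owner = SimpleNamespace(name="owner", number=2, label=3, type=11,
                        type_name=".demo.User")
print(field_declaration(user_id))
print(field_declaration(owner))
```

Because the functions only read attributes, the same code should accept the field entries of a parsed FileDescriptorProto in place of the stand-ins.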
