Is there a way to shorten this while condition?

Posted 2023-03-15

【Title】: Is there a way to shorten this while condition? 【Posted】: 2019-12-01 02:23:25 【Question】:

while (temp->left->oper == '+' || 
       temp->left->oper == '-' || 
       temp->left->oper == '*' || 
       temp->left->oper == '/' || 
       temp->right->oper == '+' || 
       temp->right->oper == '-' || 
       temp->right->oper == '*' || 
       temp->right->oper == '/')
{
    // do something
}

For clarity: temp is a pointer to the node struct below:

struct node
{
    int num;
    char oper;
    node* left;
    node* right;
};

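【Comments】:

Without knowing how temp->left and temp->right depend on each other, you cannot optimize across all of those equality tests. Visually you could use a regular expression, but internally it would probably be about the same, or even less efficient.

I would be curious to know why you think you have this problem. It looks a bit like runtime interpretation of an expression tree; if so, there are better ways to do that.

【Answer 1】: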

Sure, you could just use a string of valid operators and search it.

#include <cstring>

// : :

const char* ops = "+-*/";
while(strchr(ops, temp->left->oper) || strchr(ops, temp->right->oper))
{
     // do something
}
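One caveat with the strchr version is that strchr also matches the string's terminating null character, so a node whose oper is '\0' would pass the test. A minimal sketch of a guarded helper follows; the name is_op_char is hypothetical and not part of the answer.

```cpp
#include <cstring>

// Hypothetical helper: true only for '+', '-', '*' or '/'.
inline bool is_op_char(char c) {
    // std::strchr treats the terminating '\0' as part of the string,
    // so strchr("+-*/", '\0') returns a non-null pointer. Reject it first.
    return c != '\0' && std::strchr("+-*/", c) != nullptr;
}
```

With such a helper the loop condition would read while (is_op_char(temp->left->oper) || is_op_char(temp->right->oper)).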

If performance is a concern, then maybe a table lookup:

#include <climits>

// : :

// Start with a table initialized to all zeroes.
char is_op[1 << CHAR_BIT] = {0};

// Build the table any way you please.  This way using a string is handy.
const char* ops = "+-*/";
for (const char* op = ops; *op; op++) is_op[(unsigned char)*op] = 1;

// Then tests require no searching
// (index through unsigned char so a negative char cannot index out of bounds)
while(is_op[(unsigned char)temp->left->oper] || is_op[(unsigned char)temp->right->oper])
{
     // do something
}
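The table lookup can also be wrapped so that the table is built once and every index goes through unsigned char. This is only a sketch: op_table, make_op_table and is_op are hypothetical names, not from the answer.

```cpp
#include <array>
#include <climits>

// One flag per possible char value (hypothetical alias).
using op_table = std::array<bool, 1u << CHAR_BIT>;

// Build a table that is true exactly for the characters listed in ops.
inline op_table make_op_table(const char* ops) {
    op_table table{};  // value-initialized: all false
    for (const char* p = ops; *p; ++p)
        table[static_cast<unsigned char>(*p)] = true;
    return table;
}

// Look up a character; the table is built on first use only.
inline bool is_op(char c) {
    static const op_table table = make_op_table("+-*/");
    return table[static_cast<unsigned char>(c)];
}
```

Unlike the strchr approach, this lookup returns false for '\0', because the loop that fills the table stops at the terminator.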

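【Comments】:

You have to be a little careful with strchr because (and this is a little-known fact) the test is also true if temp->left->oper or temp->right->oper equals '\0'. In practice, though, this is probably a fine solution. You may want to extract the implementation into a separate function.

Is there a reason to use strchr here rather than std::string and .find()?

Yes, the OP's goal is to shorten the loop condition. Using ops.find(temp->left->oper) != std::string::npos is not as short as strchr. But of course, as pointed out in the comments, strchr behaves differently when searching for \0, so using it could also be considered a potential or actual bug, depending on the input.

@paddy I don't think he was looking for a code-golf answer! Just a way to avoid the long chain of || operators.

【Answer 2】: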

Yes, you certainly can!

Store the valid characters in a std::array, or even a plain array, and apply the standard algorithm std::any_of to it to check the condition.

#include <array>     // std::array
#include <algorithm> // std::any_of

static constexpr std::array<char, 4> options{ '+', '-', '*', '/' };
const auto tester = [temp](const char c) { return temp->left->oper == c || temp->right->oper == c; };
// (isValid is computed once here; the function version below re-evaluates it on every iteration)
const bool isValid = std::any_of(options.cbegin(), options.cend(), tester);

while(isValid) // now the while-loop is simplified to
{
    // do something
}

This can be cleaned up further by packing it into a function that takes the node object to check.

#include <array>     // std::array
#include <algorithm> // std::any_of

bool isValid(const node *const temp) /* noexcept */
{
   static constexpr std::array<char, 4> options{ '+', '-', '*', '/' };
   const auto tester = [temp](const char c) { return temp->left->oper == c || temp->right->oper == c; };
   return std::any_of(options.cbegin(), options.cend(), tester);
}

which can then be called in the while-loop:

while (isValid(temp)) // pass the `node*` to be checked
{
    // do something
}

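【Comments】:

Shouldn't it be std::any_of rather than std::all_of?

@displayName tester could also be const, and temp can be kept const as well.

@JeJo: I wanted to do that, but I wasn't sure. C++ noob here.

No array is needed when a string literal does the job, and no any_of is needed when std::string (or std::string_view) has .find() or .find_first_of().

Shouldn't isValid also be a lambda, so you can keep it right above the while loop, not far from where it is used? ***.com/a/63210165/10063119 // Also, const node& temp uses . access, not -> access. You probably meant to pass const node* temp.

@ankii, of course there are other ways. Your solution looks very compact and promising. isValid does not need to be a lambda unless it is used only once in the whole code base. By the way, thanks for pointing out the typo; I have corrected it.

【Answer 3】: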

Create a helper function,

bool is_arithmetic_char(char)
{
// Your implementation or one proposed in another answer.
}

and then:

while (is_arithmetic_char(temp->left->oper)
    || is_arithmetic_char(temp->right->oper))
{
    // do something
}
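The answer leaves the body of is_arithmetic_char open. One possible body, given here only as a sketch and assuming the operator set is exactly the four characters from the question, is a switch:

```cpp
// Possible implementation of the helper named in the answer.
// Assumption: the only operators are '+', '-', '*' and '/'.
bool is_arithmetic_char(char c) {
    switch (c) {
    case '+':
    case '-':
    case '*':
    case '/':
        return true;
    default:
        return false;
    }
}
```

Adding an operator later then means adding one case label in a single place, without touching the loop condition.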

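【Comments】:

Note that such a function could also be added to node, or be part of a class that inherits from node, without adding any new data members and without breaking its standard layout.

I have never believed in that kind of "micro-refactoring". All it does is move any bugs in the code from a place where you can see them in the context where they are used to a place where you cannot. Of course, if the same test appears elsewhere in the application, that is a good reason to factor it out.

@alephzero In this case I prefer this kind of helper function because (1) it abstracts a definition that may change if you want to add operators later, (2) it provides a small testable interface, and (3) it can be documented in a header file.

@alephzero: If you haven't already, watch the talk The Deep Synergy Between Testability and Good Design. It might give you another perspective.

@alephzero Answered with a lambda: ***.com/a/63210165/10063119

【Answer 4】: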

C style:

int cont = 1;
while(cont)
    switch(temp->left->oper) {
    case '+':
    case '-':
    ...
    case '/':
        // Do something
        break;
    default:
        cont = 0;
    }

If you are going to declare variables, you may need to wrap // Do something in braces.
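A variant of this pattern, suggested in the discussion of this answer, drops the flag variable in favour of while (true) with continue and break. The sketch below assumes a loop body that walks down the left children; the function name count_operator_steps and that loop body are hypothetical, chosen only so the example terminates.

```cpp
#include <cassert>

struct node {
    int num;
    char oper;
    node* left;
    node* right;
};

// Count how many consecutive left children carry an arithmetic operator.
// Precondition (assumed): every node visited has a non-null left child.
int count_operator_steps(const node* temp) {
    int steps = 0;
    while (true) {
        assert(temp != nullptr && temp->left != nullptr);
        switch (temp->left->oper) {
        case '+':
        case '-':
        case '*':
        case '/':
            ++steps;            // "do something"
            temp = temp->left;
            continue;           // continue applies to the enclosing while
        default:
            break;              // leaves the switch only
        }
        break;                  // leaves the while
    }
    return steps;
}
```

The second break is needed because a break inside a switch never exits the surrounding loop.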

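【Comments】:

This is not just "C style". For this exact case there are good reasons to prefer it, because the compiler can build a jump table from it more effectively than from a bunch of (theoretically) unrelated if checks. Even if it cannot do that, "I am comparing a bunch of different constants against the same variable" is good information that helps the compiler optimize better.

It would be better to change the condition to while (1); change the first break to continue; change the default case to break; and add another break after the switch. I have used this pattern many times in interpreters.

@T.E.D. I am skeptical of your premise. A switch with a bunch of cascading cases is no different from an if statement with a disjunctive condition (terms OR'd together). A human may not recognize that visually, but I would be surprised if a compiler could not support both equally.

@Alexander No, GCC 9.1 added jump tables, bit tests and decision trees for switch cases.

@MCCCS Interesting, thanks for that. It bothers me. It seems to me that switch and if/else if/else are isomorphic (in the special case where all predicates check a single "switched" value), and an empty cascading case body is the same as an OR'd condition in a predicate. Why can't the compiler see that? :|

【Answer 5】: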

You can construct a string containing the options and search it for the character:

#include <string>

// ...

for (auto ops = "+-*/"s; ops.find(temp-> left->oper) != std::string::npos ||
                         ops.find(temp->right->oper) != std::string::npos;)
    /* ... */;

"+-*/"s is a C++14 feature (it needs using namespace std::string_literals; in scope). Before C++14, use std::string ops = "+-*/";.

【Comments】:

【Answer 6】:

Programming is the process of finding redundancies and eliminating them.

struct node {
    int num;
    char oper;
    node* left;
    node* right;
};

while (temp->left->oper == '+' ||
       temp->left->oper == '-' ||
       temp->left->oper == '*' ||
       temp->left->oper == '/' ||
       temp->right->oper == '+' ||
       temp->right->oper == '-' ||
       temp->right->oper == '*' ||
       temp->right->oper == '/') {
    // do something
}

What is the "repeated unit" here? Well, I see two instances of

   (something)->oper == '+' || 
   (something)->oper == '-' || 
   (something)->oper == '*' || 
   (something)->oper == '/'

So let's factor the repeated part out into a function, so that we only have to write it once.

struct node {
    int num;
    char oper;
    node* left;
    node* right;

    bool oper_is_arithmetic() const {
        return this->oper == '+' ||
               this->oper == '-' ||
               this->oper == '*' ||
               this->oper == '/';
    }
};

while (temp->left->oper_is_arithmetic() ||
       temp->right->oper_is_arithmetic()) {
    // do something
}

Ta-da! Shortened! (Original code: 17 lines, 8 of which are the loop condition. Revised code: 18 lines, 2 of which are the loop condition.)

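【Comments】:

I'm not familiar with modern C++, but is there no simple expression like ['+', '-', '*', '/'].contains(this->oper)?

@Alexander: No. If you really wanted to shorten oper_is_arithmetic() further, you could write return "+-*/"s.find(this->oper) != std::string::npos; but that Perl-style soup is less readable than plain old this->oper == '+' || ....

Ugh, there are APIs in the iOS world that use this pattern. Instead of a String method like func contains(substring: String) -> Bool, there is func range(of: String) -> NSRange, which returns the index range of the matching substring, or a sentinel value if nothing is found (NSNotFound, similar to std::string::npos). I really hate this pattern; is it really so hard to add one extra function that returns a simple boolean?

【Answer 7】:

'*', '+', '-' and '/' have the ASCII decimal values 42, 43, 45 and 47, so

#define IS_OPER(x) ((x) > 41 && (x) < 48 && (x) != 44 && (x) != 46)

while(IS_OPER(temp->left->oper) || IS_OPER(temp->right->oper)) /* do something */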
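The same range test can be written as a function with character literals instead of a macro with numeric codes, which avoids evaluating the argument several times. This is a sketch only; is_oper is a hypothetical name, and like the macro it assumes an ASCII-compatible character set.

```cpp
// Range test over the ASCII run '*' ',' '+' '-' '.' '/' (codes 42..47),
// excluding ',' (44) and '.' (46). ASCII-compatible encodings only.
constexpr bool is_oper(char x) {
    return x >= '*' && x <= '/' && x != ',' && x != '.';
}

// Compile-time checks of the four operators and their neighbours.
static_assert(is_oper('+') && is_oper('-') && is_oper('*') && is_oper('/'), "");
static_assert(!is_oper(',') && !is_oper('.') && !is_oper(')') && !is_oper('0'), "");
```

Because the function is constexpr, the static_assert lines document and verify the intended set at compile time.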

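【Comments】:

Personally, I would suggest making this a helper function rather than a macro, and probably also documenting what it does, in case future maintainers are not familiar with ASCII code points.

You can write the symbols directly, as '+'. There is no need to remember these ASCII codes.

I believe using ASCII codes may actually be better in this particular case, @HolyBlackCat, because it uses range tests rather than direct comparisons; x > 41 is more readable than x > ')', and x < '0' is bound to surprise at least some people. (In that case, though, it should note the use of ASCII code points, since it is then only compatible with ASCII systems.) That said, I really like the idea behind this, because it is the kind of micro-optimization I do not trust compilers to do often. ...Although, if optimizing for performance is a concern, it would be better to start with x < 48, to maximise the number of short-circuits from the leftmost comparison.

I think this is a very good idea, even if the implementation is a bit buggy. The ASCII table ordering was designed to facilitate this kind of use.

This might perform better if the condition is usually false, since most letters will fail at the second check.

If the OP decides to support parentheses, this could even be done with a bitmask.

【Answer 8】: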

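Trading space for time, you could build two "boolean" arrays indexed by temp->left->oper and temp->right->oper respectively. The corresponding array holds true where the condition is met and false otherwise. So: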

while (array1[temp->left->oper] || array1[temp->right->oper]) {
    // do something
}

Since the sets for left and right appear to be the same, a single array will actually do.

Initialization looks like this:

static char array1[256]; // initialized to "all false"

...

array1['+'] = array1['-'] = array1['*'] = array1['/'] = '\001';

array2 is similar. Since jumps are bad for modern pipelined CPUs, you could even use a larger table, like this:

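// cast to unsigned char so a negative char cannot produce a negative index
while (array1[(unsigned char)temp->left->oper << 8 | (unsigned char)temp->right->oper]) {
    // do something
}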

But the initialization is trickier:

static char array1[256 * 256]; // initialized to "all false"

...

void init(char c) {
    // int counter: an unsigned char can never exceed 255,
    // so "i <= 255" would loop forever
    for (int i = 0; i < 256; ++i) {
        array1[(c << 8) | i] = array1[(i << 8) | c] = '\001';
    }
}


init('+');
init('-');
init('*');
init('/');
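The combined table can be sketched as a self-contained unit. The names pair_index and make_pair_table are hypothetical, and the code assumes 8-bit chars; it uses a std::vector rather than a static array purely to keep the example local.

```cpp
#include <cstddef>
#include <vector>

// Combine two chars into one index in [0, 65536). Going through
// unsigned char keeps the index non-negative when char is signed.
inline std::size_t pair_index(char left, char right) {
    return (static_cast<std::size_t>(static_cast<unsigned char>(left)) << 8)
         | static_cast<unsigned char>(right);
}

// Entry is 1 when either the left or the right char is one of ops.
inline std::vector<char> make_pair_table(const char* ops) {
    std::vector<char> table(256 * 256, 0);
    for (const char* p = ops; *p; ++p) {
        const int c = static_cast<unsigned char>(*p);
        for (int i = 0; i < 256; ++i) {
            table[(c << 8) | i] = 1;  // left char is the operator
            table[(i << 8) | c] = 1;  // right char is the operator
        }
    }
    return table;
}
```

A loop would then build the table once and test table[pair_index(temp->left->oper, temp->right->oper)] with a single lookup per iteration.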

【Comments】:

@Jarod42: It has been a long time since I wrote C++, so I just wrote it off the top of my head. Feel free to suggest changes (e.g. by editing!).

【Answer 9】:

Regular expressions to the rescue!

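#include <regex>
#include <string>

// regex_match works on strings, so each char is wrapped in a one-character std::string
while (
    std::regex_match(std::string(1, temp->left->oper), std::regex("[-+*/]")) ||
    std::regex_match(std::string(1, temp->right->oper), std::regex("[-+*/]"))
) {
    // do something
}

Explanation: the regex brackets [] denote a regex "character class", which means "match any one of the characters listed inside the brackets". For example, g[eiou]t matches "get", "git", "got" and "gut", but not "gat". Inside a character class the minus sign (-) normally denotes a range, so it is listed first to be taken literally; the plus sign (+), asterisk (*) and forward slash (/) have no special meaning there and need no backslashes.

Disclaimer: I did not have time to run this code; you may need to tweak it, but you get the idea. Because oper is a char, it has to be converted to a std::string before matching.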

References:
1. http://www.cplusplus.com/reference/regex/regex_match/
2. https://www.rexegg.com/regex-quickstart.html
3. https://www.amazon.com/Mastering-Regular-Expressions-Jeffrey-Friedl/dp/0596528124/ref=sr_1_1?keywords=regex&qid=1563904113&s=gateway&sr=8-1
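The regular expression can also be compiled once instead of on every test. A minimal sketch, assuming a hypothetical helper named is_arithmetic_regex and the default ECMAScript grammar of std::regex:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: matches a single '+', '-', '*' or '/'.
inline bool is_arithmetic_regex(char c) {
    // Function-local static: the pattern is compiled on first call only.
    // '-' comes first in the class so it is taken literally, not as a range.
    static const std::regex ops("[-+*/]");
    const std::string s(1, c);  // regex_match needs a string, not a char
    return std::regex_match(s, ops);
}
```

Even with the pattern hoisted, a regex match is likely to cost more than the direct comparisons it replaces, so this mainly serves readability.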

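【Comments】:

This is very inefficient, since the std::regex object is constructed up to twice per loop iteration and the regular expression has to be compiled repeatedly. Extracting it into a constant would already help.

【Answer 10】: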

Putting the operators in an unordered_set would be very efficient and gives O(1) access to the operators.

#include <unordered_set>

std::unordered_set<char> u_set;
u_set.insert('+');
u_set.insert('*');
u_set.insert('/');
u_set.insert('-');


if((u_set.find(temp->left->oper) != u_set.end()) || (u_set.find(temp->right->oper) != u_set.end()))
{
    //do something
}

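【Comments】:

In this case, O(1) will be way slower than O(n).

Can you explain?

O notation is an asymptotic notation, which means constants are "absorbed". In this case the overhead of unordered_set will greatly dominate the cost of a linear search over 4 characters.

【Answer 11】: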

Lambda & `std::string_view`

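string_view offers much of the functionality of std::string, can operate on literals, and does not own the string.

Use a lambda rather than a function for highly local code that is of no use to the rest of the file. Also, there is no need to pass variables when the lambda can capture them. You also get the benefit of inline without having to specify it for a function you would otherwise create.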

https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rf-capture-vs-overload

https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rstr-view

Make the char const:

https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rconst-immutable

// requires #include <string_view> (C++17)
auto is_arithm = [](const char c)
{
  return std::string_view("+-/*").find_first_of(c) != std::string_view::npos;
};

while (is_arithm(temp->left->oper) || is_arithm(temp->right->oper))

You could change const char c to const node *t and access its oper member inside the lambda. But this is not a good idea, because members of temp's left/right can still be modified.

auto is_arithm2 = [](const node *t)
{
  return std::string_view("+-/*").find_first_of(t->oper) != std::string_view::npos;
};

while(is_arithm2(temp->left) || is_arithm2(temp->right))

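【Comments】:

Looks good. However, it is not a good idea to take const-& of primitive types. Also, in the case of node *temp, it should be captured by value, since it is just a pointer and you do not need to change its contents. +1 for the string/string_view approach.

Thanks for the tips, and upvoted! I removed the capture part, for the reason mentioned in the answer. Members of the left/right members can be modified through t: t->left = nullptr (not allowed), t->left->oper = 'a' (allowed).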
