Linear search vs binary search real time performance in C++

Posted 2023-02-16

【Title】: Linear search vs binary search real time performance in C++ 【Posted】: 2021-04-30 08:09:41 【Question】:

Comparing the real-time performance of binary search against linear search in C++ with the code below gives completely unexpected results:

#include <algorithm>
#include <chrono>
#include <cstdint>
#include <cstdlib>
#include <functional>
#include <iostream>

using namespace std;

typedef std::chrono::microseconds us;

// Returns the first index in [s, e) whose value is not less than k, or e if none.
int linear_search(uint64_t* val, int s, int e, uint64_t k) {
    while (s < e) {
        if (!less<uint64_t>()(val[s], k)) {
            break;
        }
        ++s;
    }
    return s;
}

// Same result as linear_search, found by halving the range [s, e).
int binary_search(uint64_t* val, int s, int e, uint64_t k) {
    while (s != e) {
        const int mid = (s + e) >> 1;
        if (less<uint64_t>()(val[mid], k)) {
            s = mid + 1;
        } else {
            e = mid;
        }
    }
    return s;
}

int main() {
    // Preparing data
    int iter = 1000000;
    int m = 1000;
    uint64_t val[m];  // variable-length array: a GCC/Clang extension, not standard C++
    for (int i = 0; i < m; i++) {
        val[i] = rand();
    }
    sort(val, val + m);
    uint64_t key = rand();

    // Linear search time computation
    // Note: e is exclusive, so passing m - 1 leaves the last element unexamined.
    auto start = std::chrono::system_clock::now();
    for (int i = 0; i < iter; i++) {
        linear_search(val, 0, m - 1, key);
    }
    auto end = std::chrono::system_clock::now();
    auto elapsed_us = std::chrono::duration_cast<us>(end - start);
    std::cout << "Linear search: " << m << " values "
              << elapsed_us.count() << "us\n";

    // Binary search time computation
    start = std::chrono::system_clock::now();
    for (int i = 0; i < iter; i++) {
        binary_search(val, 0, m - 1, key);
    }
    end = std::chrono::system_clock::now();
    elapsed_us = std::chrono::duration_cast<us>(end - start);
    std::cout << "Binary search: " << m << " values "
              << elapsed_us.count() << "us\n";
}

Compiled without optimization, it produces the following output:

Linear search: 1000 values 1848621us
Binary search: 1000 values 24975us

Compiled with -O3 optimization, it produces this output:

Linear search: 1000 values 0us
Binary search: 1000 values 13424us

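I understand that for small arrays binary search can be more expensive than linear search, but I cannot see why adding -O3 produces a difference of this magnitude.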
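As a side note on what the two functions compute: both treat [s, e) as a half-open range and return the index of the first element that is not less than k, which is the position std::lower_bound reports. The sketch below checks that claim under the assumption that the whole array is passed (e equal to the element count). The names first_not_less_linear, first_not_less_binary and agrees_with_lower_bound are hypothetical and do not appear in the post.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical name; same logic as the question's linear_search over [s, e).
int first_not_less_linear(const std::uint64_t* val, int s, int e, std::uint64_t k) {
    while (s < e && val[s] < k) {
        ++s;
    }
    return s;
}

// Hypothetical name; same logic as the question's binary_search over [s, e).
int first_not_less_binary(const std::uint64_t* val, int s, int e, std::uint64_t k) {
    while (s != e) {
        const int mid = s + (e - s) / 2;  // same midpoint, without overflowing s + e
        if (val[mid] < k) {
            s = mid + 1;
        } else {
            e = mid;
        }
    }
    return s;
}

// Returns true if both searches agree with std::lower_bound for key k.
bool agrees_with_lower_bound(const std::vector<std::uint64_t>& sorted, std::uint64_t k) {
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    const int n = static_cast<int>(sorted.size());
    const int expected = static_cast<int>(
        std::lower_bound(sorted.begin(), sorted.end(), k) - sorted.begin());
    return first_not_less_linear(sorted.data(), 0, n, k) == expected &&
           first_not_less_binary(sorted.data(), 0, n, k) == expected;
}
```

Because e is exclusive, a call with e = m - 1 never reads val[m - 1]; it returns m - 1 whenever every earlier element is smaller than the key.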

【Comments】:

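It got optimized away. Do something with the output so the compiler does not skip the block entirely.

You don't use the result of the linear search, so the compiler removes that part when optimizations are turned on. The puzzling part is why the same does not happen to the binary search.

I recommend the compiler explorer; it lets you look at the generated assembly to see what is going on.

@largest_prime_is_463035818 val is taken by pointer, and the compiler cannot rule out side effects if the variable is referenced later. The binary search comes last and there is no reference to val after it.

@AlessandroTeruzzi Yes, but no. The compiler did optimize away linear_search, which comes first, and both functions take the pointer. Once it has optimized away linear_search, it could remove binary_search as well.

【Answer 1】: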

I benchmarked your code with https://quick-bench.com, and binary search is much faster (for m = 100; it breaks for m = 1000). Here is my benchmark code:

#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <functional>
#include <benchmark/benchmark.h>  // quick-bench.com provides this header automatically

int linear_search(uint64_t* val, int s, int e, uint64_t k) {
    while (s < e) {
        if (!std::less<uint64_t>()(val[s], k)) {
            break;
        }
        ++s;
    }
    return s;
}

int binary_search(uint64_t* val, int s, int e, uint64_t k) {
    while (s != e) {
        const int mid = (s + e) >> 1;
        if (std::less<uint64_t>()(val[mid], k)) {
            s = mid + 1;
        } else {
            e = mid;
        }
    }
    return s;
}

constexpr int m = 100;
uint64_t val[m];
uint64_t key = rand();

// Fills and sorts val once; later calls return immediately.
void init() {
    static bool isInitialized = false;
    if (isInitialized) return;
    for (int i = 0; i < m; i++) {
        val[i] = rand();
    }
    std::sort(val, val + m);
    isInitialized = true;
}

static void Linear(benchmark::State& state) {
    init();
    for (auto _ : state) {
        int result = linear_search(val, 0, m - 1, key);
        benchmark::DoNotOptimize(result);  // keep the result, and the call, alive
    }
}
BENCHMARK(Linear);

static void Binary(benchmark::State& state) {
    init();
    for (auto _ : state) {
        int result = binary_search(val, 0, m - 1, key);
        benchmark::DoNotOptimize(result);
    }
}
BENCHMARK(Binary);

Results:

Only the code inside for (auto _ : state) is benchmarked.
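For readers without Google Benchmark at hand, the role that benchmark::DoNotOptimize plays in the loops above can be approximated by hand. The following is only a sketch, assuming GCC or Clang: keep_alive and repeat_search are hypothetical names, and the empty inline-asm statement is a common idiom rather than a documented guarantee of the library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (GCC/Clang only): the empty asm statement names `value` as an
// input and declares a memory clobber, so the optimizer must treat the value as used.
template <typename T>
inline void keep_alive(const T& value) {
    asm volatile("" : : "r,m"(value) : "memory");
}

// Same logic as linear_search above, over the half-open range [s, e).
int linear_search(const std::uint64_t* val, int s, int e, std::uint64_t k) {
    while (s < e && val[s] < k) {
        ++s;
    }
    return s;
}

// Repeats the search and passes every result through the barrier.
// Returns the last result so that callers can sanity-check it.
int repeat_search(const std::uint64_t* val, int n, std::uint64_t k, int iterations) {
    assert(n >= 0 && iterations >= 0);
    int result = 0;
    for (int i = 0; i < iterations; ++i) {
        result = linear_search(val, 0, n, k);
        keep_alive(result);
    }
    return result;
}
```

Timing code would wrap only the loop inside repeat_search, in the same way that the benchmark measures only the body of for (auto _ : state).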

【Comments】:

I didn't know about benchmark. Thanks for the updated code!

【Answer 2】:

The compiler managed to realize that your linear search is a no-op (it has no side effects) and turned it into doing nothing. So it takes zero time.

To fix this, consider taking the return values, adding them up, and printing the sum outside the timed block.
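One way to follow that suggestion is sketched below. Timing and time_linear are hypothetical names, and two details are assumptions that go beyond the answer: the key changes on every iteration, because an optimizer may otherwise hoist a loop-invariant call out of the loop, and std::chrono::steady_clock replaces system_clock because it is monotonic.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

// Same logic as the question's linear_search, over the half-open range [s, e).
int linear_search(const std::uint64_t* val, int s, int e, std::uint64_t k) {
    while (s < e && val[s] < k) {
        ++s;
    }
    return s;
}

struct Timing {
    long long checksum;    // sum of all returned indices
    long long elapsed_us;  // wall time of the loop in microseconds
};

// Hypothetical helper: runs one search per key and accumulates the results, so the
// calls have an effect that is observable after the timed region.
Timing time_linear(const std::vector<std::uint64_t>& sorted,
                   const std::vector<std::uint64_t>& keys) {
    assert(!sorted.empty());
    const int n = static_cast<int>(sorted.size());
    long long checksum = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t k : keys) {
        checksum += linear_search(sorted.data(), 0, n, k);
    }
    const auto end = std::chrono::steady_clock::now();
    const auto us = std::chrono::duration_cast<std::chrono::microseconds>(end - start);
    return {checksum, static_cast<long long>(us.count())};
}
```

The caller prints both checksum and elapsed_us after time_linear returns; since the checksum reaches the output, the searches can no longer be treated as dead code.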

【Comments】:
