How to make my std::vector implementation faster? [duplicate]

Posted 2023-02-21


[Title]: How to make my std::vector implementation faster? [duplicate] [Asked]: 2014-09-29 13:17:37 [Question]:

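I am trying to write an implementation of std::vector to learn C++, but my implementation is slower than std::vector (see the output).

I would like advice from C++ experts on how to improve it. I saw this question (Why is std::vector so fast ( or is my implementation is too slow )), but it did not help because the poster there used the wrong data structure.

I am asking how I can make it as fast as, or faster than, std::vector.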

vector.h

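template <typename T>
class Vector {
public:
    explicit Vector(const int n);
    explicit Vector(const int n, const T& val);
    T& operator[](const int i);
    inline int const length();
    inline void fill(const T& val);
    inline T sum();  // defined in vector.cpp and called by the test
private:
    T* arr;   // heap-allocated element storage
    int len;  // number of elements
};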

vector.cpp

#include "vector.h"
#include <iostream>
#include <algorithm>

using namespace std;

template <typename T>
inline void Vector<T>::fill(const T& val)
{
    for (int i = 0; i < len; ++i) {
        arr[i] = val;
    }
}

template <typename T>
inline T Vector<T>::sum()  // returns by value: a T& to the local total would dangle
{
    T total = 0;
    for (int i = 0; i < len; ++i) {
        total += arr[i];
    }
    return total;
}

template <typename T>
Vector<T>::Vector(const int n) : arr(new T[n]()), len(n)
{
    //cout << "Vector(n)" <<'\n';
}

template <typename T>
Vector<T>::Vector(const int n, const T& val) : arr(new T[n]), len(n)
{
    //cout << "Vector(n, val)" <<'\n';
    for (int i = 0; i < len; ++i) {
        arr[i] = val;
    }
}

template <typename T>
T& Vector<T>::operator[](const int i)
{
    return arr[i];
}

template <typename T>
int const Vector<T>::length()
{
    return len;
}

template class Vector<int>;
template class Vector<float>;

vector_test.cpp

#include "vector.h"
#include <iostream>
#include <chrono>
#include <vector>

using namespace std;

int main()
{
    const int n = 2000000;
    float sum = 0;

    // Time construction plus summation for the custom Vector.
    chrono::steady_clock::time_point start = chrono::steady_clock::now();
    Vector<float> vec(n, 1);
    sum = vec.sum();
    chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
    cout << "my vec sum = " << sum << '\n';
    cout << "my vec impl took " << chrono::duration_cast<chrono::microseconds>(end - start).count()
              << "us.\n";

    // Time the same work for std::vector.
    sum = 0;
    start = chrono::steady_clock::now();
    vector<float> vec2(n, 1);
    for (int i = 0; i < n; ++i) {
        sum += vec2[i];
    }
    end = std::chrono::steady_clock::now();
    cout << "std::vec sum = " << sum << '\n';
    cout << "stl::vec impl took " << chrono::duration_cast<chrono::microseconds>(end - start).count()
              << "us.\n";
}

Output:

my vec sum = 2e+06
my vec impl took 11040us.
std::vec sum = 2e+06
stl::vec impl took 8034us.
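
The figures above come from a single run of each variant in a fixed order, so one-off effects such as first-touch memory costs may be mixed into them. A minimal sketch of a harness that repeats a measurement and keeps the fastest run, assuming C++11; `best_of_us`, `sum_std_vector`, `time_std_vector`, and `sink` are hypothetical names introduced for illustration and are not part of the original test:

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical sink: storing results in a volatile keeps the optimizer
// from discarding the measured work as unused.
volatile float sink = 0;

// Hypothetical helper: calls work() `repeats` times and returns the
// duration of the fastest call in microseconds.
template <typename F>
long long best_of_us(int repeats, F work)
{
    assert(repeats > 0);
    long long best = -1;
    for (int r = 0; r < repeats; ++r) {
        auto start = std::chrono::steady_clock::now();
        work();
        auto end = std::chrono::steady_clock::now();
        long long us =
            std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
        if (best < 0 || us < best) {
            best = us;
        }
    }
    return best;
}

// Builds a std::vector<float> of n ones and sums it, mirroring the
// std::vector half of the test above.
float sum_std_vector(int n)
{
    std::vector<float> v(n, 1.0f);
    float sum = 0;
    for (int i = 0; i < n; ++i) {
        sum += v[i];
    }
    return sum;
}

// Times sum_std_vector(n) over several runs and returns the best time.
long long time_std_vector(int n, int repeats)
{
    return best_of_us(repeats, [n] { sink = sum_std_vector(n); });
}
```

The custom Vector could be timed the same way by passing a lambda that constructs it and calls sum(). Swapping the order of the two measurements then shows whether the gap depends on which variant runs first.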

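[Comments]:

This question, with slight modifications (maybe, not sure), might be better suited to [code-review.se] (codereview.stackexchange.com). That said, good luck getting faster than std::vector.

Why is T& operator[](const int i); not inline?

There is no need to make a constructor with 2 parameters explicit.

Sergey, it is not necessary, but it is a good habit. What if someone gives the second parameter a default value? (At least that is what we stick to :D)

1) Did you compile with optimizations enabled? 2) What happens when you reverse the tests (std::vector first, then your Vector)? 3) What happens when you put all the function definitions in the header file?

[Answer 1]: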

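This is very naive code, because arr[i] recomputes the element address from the index on every iteration (you are relying on the optimizer to remove that):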

for (int i = 0; i < len; ++i) {
    arr[i] = val;
}

Here is a better way:

T* ptr = arr;
T* end = ptr + len;
while ( ptr < end ) *ptr++ = val;

That said, a good compiler will perform this transformation on its own.
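
To check that claim for a particular compiler, the two loop forms can be placed side by side. A minimal sketch, assuming free functions named `fill_indexed` and `fill_pointer` and a checker `fills_agree` (all hypothetical, not part of the original code), with `std::fill` from `<algorithm>` as the standard-library equivalent:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Index-based fill, as written in the question.
template <typename T>
void fill_indexed(T* arr, int len, const T& val)
{
    for (int i = 0; i < len; ++i) {
        arr[i] = val;
    }
}

// Pointer-based fill, as suggested in this answer.
template <typename T>
void fill_pointer(T* arr, int len, const T& val)
{
    T* ptr = arr;
    T* end = ptr + len;
    while (ptr < end) {
        *ptr++ = val;
    }
}

// Returns true when both hand-written loops and std::fill leave
// identical contents in three buffers of the same length.
bool fills_agree(int len, float val)
{
    assert(len >= 0);
    std::vector<float> a(len), b(len), c(len);
    fill_indexed(a.data(), len, val);
    fill_pointer(b.data(), len, val);
    std::fill(c.begin(), c.end(), val);
    return a == b && b == c;
}
```

Compiling both functions with optimizations enabled (for example -O2 on GCC or Clang) and comparing the generated assembly is one way to see whether the two loops end up different on a given toolchain.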

The same idea can be applied to sum():

template <typename T> inline T Vector<T>::sum()
{
    T* ptr = arr;
    T* end = ptr + len;
    T total = 0;

    while ( ptr < end ) total += *ptr++;

    return total;
}
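
For completeness, the pointer-based loops can be combined with definitions placed directly in the class body, so that every translation unit including the header sees the function bodies. A minimal sketch under stated assumptions: the name `SketchVector`, the destructor, the deleted copy operations, the const-qualified accessors, and the debug-only bounds check are all additions for illustration and do not appear in the original code.

```cpp
#include <cassert>

// Hypothetical header-only variant of the question's Vector.
// Member functions defined inside the class body are implicitly inline.
template <typename T>
class SketchVector {
public:
    // Value-initializes n elements (zero for arithmetic types).
    explicit SketchVector(int n) : arr(new T[n]()), len(n) {}

    // Allocates n elements and sets each one to val.
    SketchVector(int n, const T& val) : arr(new T[n]), len(n) { fill(val); }

    // Assumption: release the buffer; the original shows no destructor.
    ~SketchVector() { delete[] arr; }

    // Assumption: copying is disabled so two objects never own one buffer.
    SketchVector(const SketchVector&) = delete;
    SketchVector& operator=(const SketchVector&) = delete;

    // The bounds check is compiled out when NDEBUG is defined.
    T& operator[](int i)
    {
        assert(i >= 0 && i < len);
        return arr[i];
    }

    int length() const { return len; }

    // Pointer-based fill, following the loop shown in the answer.
    void fill(const T& val)
    {
        T* end = arr + len;
        for (T* ptr = arr; ptr < end; ++ptr) {
            *ptr = val;
        }
    }

    // Pointer-based sum, returned by value.
    T sum() const
    {
        T total = T();
        const T* end = arr + len;
        for (const T* ptr = arr; ptr < end; ++ptr) {
            total += *ptr;
        }
        return total;
    }

private:
    T* arr;   // owned heap buffer
    int len;  // number of elements
};
```

With this layout no explicit instantiation such as `template class Vector<float>;` is needed, and the compiler can inline sum() and operator[] at the call site. Whether that closes the gap to std::vector still has to be measured.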
