C++ unordered_map using a custom class type as the key

Posted 2023-02-17

【Asked】: 2013-06-10 02:34:40

【Question】:

I am trying to use a custom class as the key of an unordered_map, like this:

#include <iostream>
#include <algorithm>
#include <unordered_map>
#include <vector>

using namespace std;

class node;
class Solution;

class Node {
public:
    int a;
    int b;
    int c;
    Node() {}
    Node(vector<int> v) {
        sort(v.begin(), v.end());
        a = v[0];
        b = v[1];
        c = v[2];
    }

    // As posted: operator== is not const-qualified, and no hash
    // function is defined for Node, so this does not compile.
    bool operator==(Node i) {
        if ( i.a==this->a && i.b==this->b &&i.c==this->c ) {
            return true;
        } else {
            return false;
        }
    }
};

int main() {
    unordered_map<Node, int> m;

    vector<int> v;
    v.push_back(3);
    v.push_back(8);
    v.push_back(9);
    Node n(v);

    m[n] = 0;

    return 0;
}

However, g++ gives me the following error:

In file included from /usr/include/c++/4.6/string:50:0,
                 from /usr/include/c++/4.6/bits/locale_classes.h:42,
                 from /usr/include/c++/4.6/bits/ios_base.h:43,
                 from /usr/include/c++/4.6/ios:43,
                 from /usr/include/c++/4.6/ostream:40,
                 from /usr/include/c++/4.6/iostream:40,
                 from 3sum.cpp:4:
/usr/include/c++/4.6/bits/stl_function.h: In member function ‘bool std::equal_to<_Tp>::operator()(const _Tp&, const _Tp&) const [with _Tp = Node]’:
/usr/include/c++/4.6/bits/hashtable_policy.h:768:48:   instantiated from ‘bool std::__detail::_Hash_code_base<_Key, _Value, _ExtractKey, _Equal, _H1, _H2, std::__detail::_Default_ranged_hash, false>::_M_compare(const _Key&, std::__detail::_Hash_code_base<_Key, _Value, _ExtractKey, _Equal, _H1, _H2, std::__detail::_Default_ranged_hash, false>::_Hash_code_type, std::__detail::_Hash_node<_Value, false>*) const [with _Key = Node, _Value = std::pair<const Node, int>, _ExtractKey = std::_Select1st<std::pair<const Node, int> >, _Equal = std::equal_to<Node>, _H1 = std::hash<Node>, _H2 = std::__detail::_Mod_range_hashing, std::__detail::_Hash_code_base<_Key, _Value, _ExtractKey, _Equal, _H1, _H2, std::__detail::_Default_ranged_hash, false>::_Hash_code_type = long unsigned int]’
/usr/include/c++/4.6/bits/hashtable.h:897:2:   instantiated from ‘std::_Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, _H1, _H2, _Hash, _RehashPolicy, __cache_hash_code, __constant_iterators, __unique_keys>::_Node* std::_Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, _H1, _H2, _Hash, _RehashPolicy, __cache_hash_code, __constant_iterators, __unique_keys>::_M_find_node(std::_Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, _H1, _H2, _Hash, _RehashPolicy, __cache_hash_code, __constant_iterators, __unique_keys>::_Node*, const key_type&, typename std::_Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, _H1, _H2, _Hash, _RehashPolicy, __cache_hash_code, __constant_iterators, __unique_keys>::_Hash_code_type) const [with _Key = Node, _Value = std::pair<const Node, int>, _Allocator = std::allocator<std::pair<const Node, int> >, _ExtractKey = std::_Select1st<std::pair<const Node, int> >, _Equal = std::equal_to<Node>, _H1 = std::hash<Node>, _H2 = std::__detail::_Mod_range_hashing, _Hash = std::__detail::_Default_ranged_hash, _RehashPolicy = std::__detail::_Prime_rehash_policy, bool __cache_hash_code = false, bool __constant_iterators = false, bool __unique_keys = true, std::_Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, _H1, _H2, _Hash, _RehashPolicy, __cache_hash_code, __constant_iterators, __unique_keys>::_Node = std::__detail::_Hash_node<std::pair<const Node, int>, false>, std::_Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, _H1, _H2, _Hash, _RehashPolicy, __cache_hash_code, __constant_iterators, __unique_keys>::key_type = Node, typename std::_Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, _H1, _H2, _Hash, _RehashPolicy, __cache_hash_code, __constant_iterators, __unique_keys>::_Hash_code_type = long unsigned int]’
/usr/include/c++/4.6/bits/hashtable_policy.h:546:53:   instantiated from ‘std::__detail::_Map_base<_Key, _Pair, std::_Select1st<_Pair>, true, _Hashtable>::mapped_type& std::__detail::_Map_base<_Key, _Pair, std::_Select1st<_Pair>, true, _Hashtable>::operator[](const _Key&) [with _Key = Node, _Pair = std::pair<const Node, int>, _Hashtable = std::_Hashtable<Node, std::pair<const Node, int>, std::allocator<std::pair<const Node, int> >, std::_Select1st<std::pair<const Node, int> >, std::equal_to<Node>, std::hash<Node>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, false, false, true>, std::__detail::_Map_base<_Key, _Pair, std::_Select1st<_Pair>, true, _Hashtable>::mapped_type = int]’
3sum.cpp:149:5:   instantiated from here
/usr/include/c++/4.6/bits/stl_function.h:209:23: error: passing ‘const Node’ as ‘this’ argument of ‘bool Node::operator==(Node)’ discards qualifiers [-fpermissive]
make: *** [threeSum] Error 1

I guess I need to tell C++ how to hash the class Node, but I am not quite sure how to do that. How can I accomplish this?

【Comments】:

The third template argument is the hash function you need to supply. cppreference has a simple, practical example of how to do this: en.cppreference.com/w/cpp/container/unordered_map/unordered_map

【Answer 1】:

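To be able to use std::unordered_map (or one of the other unordered associative containers) with a user-defined key type, you need to define two things:

1. A hash function; this must be a class that defines operator() and computes the hash value for a given object of the key type. A particularly straightforward way of doing this is to specialize the std::hash template for your key type.

2. An equality comparison function; this is required because the hash table cannot rely on the hash function always producing a unique hash value for every distinct key (i.e., it needs to be able to deal with collisions), so it needs a way to check whether two given keys match exactly. You can implement this as a class that defines operator(), as a specialization of std::equal_to, or, easiest of all, by overloading operator==() for your key type (as you did already, although it must be declared const so that it can be called on const keys, which is what the error message above is reporting).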

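The difficulty with the hash function is that if your key type consists of several members, you will usually have the hash function calculate hash values for the individual members and then somehow combine them into one hash value for the entire object. For good performance (i.e., few collisions) you should think carefully about how to combine the individual hash values, so that different objects do not end up with the same output too often.

A fairly good starting point for a hash function is one that uses bit shifting and bitwise XOR to combine the individual hash values. For example, assuming a key type like this: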

struct Key
{
  std::string first;
  std::string second;
  int         third;

  bool operator==(const Key &other) const
  { return (first == other.first
            && second == other.second
            && third == other.third);
  }
};

Here is a simple hash function (adapted from the one used in the cppreference example for user-defined hash functions):

namespace std {

  template <>
  struct hash<Key>
  {
    std::size_t operator()(const Key& k) const
    {
      using std::size_t;
      using std::hash;
      using std::string;

      // Compute individual hash values for first,
      // second and third and combine them using XOR
      // and bit shifting:

      return ((hash<string>()(k.first)
               ^ (hash<string>()(k.second) << 1)) >> 1)
               ^ (hash<int>()(k.third) << 1);
    }
  };

}

With this in place, you can instantiate a std::unordered_map for the key type:

int main()
{
  std::unordered_map<Key,std::string> m6 = {
    { {"John", "Doe", 12}, "example"},
    { {"Mary", "Sue", 21}, "another"}
  };
}

It will automatically use std::hash<Key> as defined above for the hash value calculations, and the operator== defined as a member function of Key for equality checks.

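If you don't want to specialize a template inside the std namespace (although it is perfectly legal in this case), you can define the hash function as a separate class and add it to the template argument list of the map: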

struct KeyHasher
{
  std::size_t operator()(const Key& k) const
  {
    using std::size_t;
    using std::hash;
    using std::string;

    return ((hash<string>()(k.first)
             ^ (hash<string>()(k.second) << 1)) >> 1)
             ^ (hash<int>()(k.third) << 1);
  }
};

int main()
{
  std::unordered_map<Key,std::string,KeyHasher> m6 = {
    { {"John", "Doe", 12}, "example"},
    { {"Mary", "Sue", 21}, "another"}
  };
}

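How to define a better hash function?

As said above, defining a good hash function is important for avoiding collisions and getting good performance. For a really good one you need to take into account the distribution of possible values of all fields and define a hash function that projects that distribution onto a space of possible results that is as wide and evenly distributed as possible.

This can be difficult; the XOR/bit-shifting method above is probably not a bad start. For a slightly better start, you can use the hash_value and hash_combine function templates from the Boost library. The former acts in a similar way to std::hash for standard types (recently also including tuples and other useful standard types); the latter helps you combine individual hash values into one. Here is a rewrite of the hash function that uses the Boost helper functions: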

#include <boost/functional/hash.hpp>

struct KeyHasher
{
  std::size_t operator()(const Key& k) const
  {
      using boost::hash_value;
      using boost::hash_combine;

      // Start with a hash value of 0.
      std::size_t seed = 0;

      // Modify 'seed' by XORing and bit-shifting in
      // one member of 'Key' after the other:
      hash_combine(seed,hash_value(k.first));
      hash_combine(seed,hash_value(k.second));
      hash_combine(seed,hash_value(k.third));

      // Return the result.
      return seed;
  }
};

And here is a rewrite that doesn't use Boost, yet still uses a good method of combining the hashes:

namespace std
{
    template <>
    struct hash<Key>
    {
        size_t operator()( const Key& k ) const
        {
            // Compute individual hash values for first, second and third
            // http://***.com/a/1646913/126995
            size_t res = 17;
            res = res * 31 + hash<string>()( k.first );
            res = res * 31 + hash<string>()( k.second );
            res = res * 31 + hash<int>()( k.third );
            return res;
        }
    };
}

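【Comments】:

Can you explain why it is necessary to shift the bits in KeyHasher?

If you didn't shift the bits and two strings were the same, the XOR would cause them to cancel each other out. So hash("a","a",1) would be the same as hash("b","b",1). The order wouldn't matter either, so hash("a","b",1) would be the same as hash("b","a",1).

I am just learning C++, and one thing I always struggle with is: where to put the code? I have written a specialized std::hash for my key as you have done. I put it at the bottom of my Key.cpp file, but I am getting the following error:

Error 57 error C2440: 'type cast' : cannot convert from 'const Key' to 'size_t'	c:\program files (x86)\microsoft visual studio 10.0\vc\include\xfunctional

I am guessing that the compiler is not finding my hash method? Should I be adding anything to my Key.h file?

@Ben Putting it into the .h file is correct. std::hash is not actually a struct, but a template (specialization) for a struct. So it isn't an implementation; it is turned into one when the compiler needs it. Templates should always go into header files. See also ***.com/questions/495021/…

@nightfury find() returns an iterator, and that iterator points to an "entry" of the map. An entry is a std::pair consisting of key and value. So if you do auto iter = m6.find({"John","Doe",12});, you get the key in iter->first and the value (i.e. the string "example") in iter->second. If you want the string directly, you can use either m6.at({"John","Doe",12}) (which throws an exception if the key doesn't exist) or m6[{"John","Doe",12}] (which creates an empty value if the key doesn't exist).

【Answer 2】: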

I think jogojapan gave a very good and exhaustive answer. You should definitely take a look at it before reading my post. However, I'd like to add the following:

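1. You can define the comparison function for an unordered_map separately, instead of using the equality comparison operator (operator==). This can be helpful, for example, if you want operator== to compare all members of two Node objects but only want some specific members to act as the key of the unordered_map.

2. You can also use lambda expressions instead of defining separate hash and comparison functions.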

All in all, for your Node class, the code could be written as follows:

using h = std::hash<int>;
auto hash = [](const Node& n){return ((17 * 31 + h()(n.a)) * 31 + h()(n.b)) * 31 + h()(n.c);};
auto equal = [](const Node& l, const Node& r){return l.a == r.a && l.b == r.b && l.c == r.c;};
std::unordered_map<Node, int, decltype(hash), decltype(equal)> m(8, hash, equal);

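Notes:

I just reused the hashing method from the end of jogojapan's answer, but more general solutions exist if you don't want to use Boost.

My code may be a bit too minified; the original answer links to a slightly more readable version on Ideone.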

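【Comments】:

Where does the 8 come from, and what does it mean?

@WhalalalalalalaCHen: Please have a look at the documentation of the unordered_map constructor. The 8 represents the so-called "bucket count". A bucket is a slot in the container's internal hash table; see e.g. unordered_map::bucket_count for more information.

@WhalalalalalalaCHen: I picked 8 at random. Depending on what you want to store in your unordered_map, the bucket count can affect the performance of the container.

【Answer 3】: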

The most basic copy/paste, complete, runnable example of using a custom class as the key of an unordered_map (a basic implementation of a sparse matrix):

// UnorderedMapObjectAsKey.cpp

#include <iostream>
#include <vector>
#include <unordered_map>

struct Pos
{
  int row;
  int col;

  Pos() { }
  Pos(int row, int col)
  {
    this->row = row;
    this->col = col;
  }

  bool operator==(const Pos& otherPos) const
  {
    if (this->row == otherPos.row && this->col == otherPos.col) return true;
    else return false;
  }

  struct HashFunction
  {
    size_t operator()(const Pos& pos) const
    {
      size_t rowHash = std::hash<int>()(pos.row);
      size_t colHash = std::hash<int>()(pos.col) << 1;
      return rowHash ^ colHash;
    }
  };
};

int main(void)
{
  std::unordered_map<Pos, int, Pos::HashFunction> umap;

  // at row 1, col 2, set value to 5
  umap[Pos(1, 2)] = 5;

  // at row 3, col 4, set value to 10
  umap[Pos(3, 4)] = 10;

  // print the umap
  std::cout << "\n";
  for (auto& element : umap)
  {
    std::cout << "( " << element.first.row << ", " << element.first.col << " ) = " << element.second << "\n";
  }
  std::cout << "\n";

  return 0;
}
【Answer 4】:

For enum types, I think this is a suitable approach; the only difference from the class case is how the hash value is computed.

template <typename T>
struct EnumTypeHash {
  std::size_t operator()(const T& type) const {
    return static_cast<std::size_t>(type);
  }
};

enum MyEnum {};
class MyValue {};

std::unordered_map<MyEnum, MyValue, EnumTypeHash<MyEnum>> map_;

【Answer 5】:

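The STL does not provide a hash function for pairs (std::pair). You need to implement it yourself and either specify it as a template parameter or put it into namespace std, from where it will be picked up automatically (note that specializing std::hash is only permitted when the specialization depends on a user-defined type). https://github.com/HowardHinnant/hash_append/blob/master/n3876.h is very useful for implementing custom hash functions for structures. More details are well explained in the other answers to this question, so I won't repeat them. There is also something similar (hash_combine) in Boost.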

【Answer 6】:

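See the following link for more details: https://www.geeksforgeeks.org/how-to-create-an-unordered_map-of-user-defined-class-in-cpp/

The custom class must implement the == operator.

A hash function must be created for the class (for primitive types such as int, and for types such as string, the hash function is predefined).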
