我知道STL有一个HashMap API,但我找不到任何好的和彻底的文档,有关于这方面的好例子。
任何好的例子都将受到赞赏。
我知道STL有一个HashMap API,但我找不到任何好的和彻底的文档,有关于这方面的好例子。
任何好的例子都将受到赞赏。
标准库包括有序和无序map (std::map和std::unordered_map)容器。在有序映射(std::map)中,元素是按键排序的,插入和访问是O(log n).通常标准库内部使用红黑树进行有序映射。但这只是一个实现细节。在一个无序映射(std::unordered_map)中,插入和访问在O(1)中。它只是哈希表的另一个名称。
一个(ordered) std::map的例子:
#include <map>
#include <iostream>
#include <cassert>
int main(int argc, char **argv)
{
std::map<std::string, int> m;
m["hello"] = 23;
// check if key is present
if (m.find("world") != m.end())
std::cout << "map contains key world!\n";
// retrieve
std::cout << m["hello"] << '\n';
std::map<std::string, int>::iterator i = m.find("hello");
assert(i != m.end());
std::cout << "Key: " << i->first << " Value: " << i->second << '\n';
return 0;
}
输出:
23 Key: hello Value: 23
如果你需要在容器中排序,并且可以接受O(log n)运行时,那么就使用std::map。
否则,如果你真的需要一个哈希表(O(1)插入/访问),检查std::unordered_map,它有一个类似于std::map的API(例如,在上面的例子中,你只需要搜索和替换map与unordered_map)。
unordered_map容器是在c++ 11标准修订中引入的。因此,取决于你的编译器,你必须启用c++11特性(例如,当使用GCC 4.8时,你必须在CXXFLAGS中添加-std=c++11)。
甚至在c++ 11发布之前,GCC就在命名空间std::tr1中支持unordered_map -。因此,对于旧的GCC编译器,你可以尝试这样使用它:
#include <tr1/unordered_map>
std::tr1::unordered_map<std::string, int> m;
它也是boost的一部分,也就是说,你可以使用相应的boost-header来获得更好的可移植性。
hash_map是一种较旧的、非标准化的map(最初在TR1中,从c++ 11开始包含在标准中)。顾名思义,它与std::map的主要不同之处在于它是无序的——例如,如果从begin()到end()遍历一个map,则按key1顺序获得项,但如果从begin()到end()遍历一个unordered_map,则以或多或少任意的顺序获得项。
An unordered_map is normally expected to have constant complexity. That is, an insertion, lookup, etc., typically takes essentially a fixed amount of time, regardless of how many items are in the table. An std::map has complexity that's logarithmic on the number of items being stored -- which means the time to insert or retrieve an item grows, but quite slowly, as the map grows larger. For example, if it takes 1 microsecond to lookup one of 1 million items, then you can expect it to take around 2 microseconds to lookup one of 2 million items, 3 microseconds for one of 4 million items, 4 microseconds for one of 8 million items, etc.
From a practical viewpoint, that's not really the whole story though. By nature, a simple hash table has a fixed size. Adapting it to the variable-size requirements for a general purpose container is somewhat non-trivial. As a result, operations that (potentially) grow the table (e.g., insertion) are potentially relatively slow (that is, most are fairly fast, but periodically one will be much slower). Lookups, which cannot change the size of the table, are generally much faster. As a result, most hash-based tables tend to be at their best when you do a lot of lookups compared to the number of insertions. For situations where you insert a lot of data, then iterate through the table once to retrieve results (e.g., counting the number of unique words in a file) chances are that an std::map will be just as fast, and quite possibly even faster (but, again, the computational complexity is different, so that can also depend on the number of unique words in the file).
在创建映射时,顺序由第三个模板参数定义,默认情况下std::less<T>。
下面是一个更完整、更灵活的示例,它没有省略生成编译错误所需的include:
#include <iostream>
#include <unordered_map>
class Hashtable {
std::unordered_map<const void *, const void *> htmap;
public:
void put(const void *key, const void *value) {
htmap[key] = value;
}
const void *get(const void *key) {
return htmap[key];
}
};
int main() {
Hashtable ht;
ht.put("Bob", "Dylan");
int one = 1;
ht.put("one", &one);
std::cout << (char *)ht.get("Bob") << "; " << *(int *)ht.get("one");
}
对于键仍然不是特别有用,除非它们被预定义为指针,因为匹配的值是不行的!(但是,由于我通常使用字符串作为键,在键的声明中用“string”代替“const void *”应该可以解决这个问题。)
在GCC stdlibc++ 6.4中,std::unordered_map使用哈希映射的证据
这在:https://stackoverflow.com/a/3578247/895245上提到过,但在下面的回答中:c++中的std::map内部是什么数据结构?我已经为GCC stdlibc++ 6.4实现提供了进一步的证据:
GDB步骤调试成类 性能特征分析
以下是回答中描述的性能特征图的预览:
如何使用自定义类和哈希函数与unordered_map
这个答案确定了:c++ unordered_map使用自定义类类型作为键
摘录:平等:
struct Key
{
std::string first;
std::string second;
int third;
bool operator==(const Key &other) const
{ return (first == other.first
&& second == other.second
&& third == other.third);
}
};
哈希函数:
namespace std {
template <>
struct hash<Key>
{
std::size_t operator()(const Key& k) const
{
using std::size_t;
using std::hash;
using std::string;
// Compute individual hash values for first,
// second and third and combine them using XOR
// and bit shifting:
return ((hash<string>()(k.first)
^ (hash<string>()(k.second) << 1)) >> 1)
^ (hash<int>()(k.third) << 1);
}
};
}
对于那些试图在使用标准模板的同时找出如何散列我们自己的类的人来说,有一个简单的解决方案:
In your class you need to define an equality operator overload ==. If you don't know how to do this, GeeksforGeeks has a great tutorial https://www.geeksforgeeks.org/operator-overloading-c/ Under the standard namespace, declare a template struct called hash with your classname as the type (see below). I found a great blogpost that also shows an example of calculating hashes using XOR and bitshifting, but that's outside the scope of this question, but it also includes detailed instructions on how to accomplish using hash functions as well https://prateekvjoshi.com/2014/06/05/using-hash-function-in-c-for-user-defined-classes/
namespace std {
template<>
struct hash<my_type> {
size_t operator()(const my_type& k) {
// Do your hash function here
...
}
};
}
因此,为了使用新的哈希函数实现哈希表,你只需要创建一个std::map或std::unordered_map,就像你通常做的那样,并使用my_type作为键,标准库将自动使用你之前(在步骤2中)定义的哈希函数来哈希你的键。
#include <unordered_map>
int main() {
std::unordered_map<my_type, other_type> my_map;
}