Hash Table Implementation: Principles and Code Walkthrough

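This article covers the basic concepts of hash tables, hash function design, the load factor, and collision resolution. It walks through hashing strategies such as direct addressing, the division method, and the multiplication method, and compares open addressing (linear probing) with separate chaining. C++ template code demonstrates a complete hash table implementation, including slot state management, resizing with a prime table, functors for different key types, and the logic of deletion.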

Posted by 黑客 on 2026/3/25, updated 2026/4/18

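1. Hashing concepts

Hashing, also called scattering, is a way of organizing data. The name suggests a scattered arrangement. In essence, a hash function establishes a mapping between a key and a storage position. A lookup computes the key's position with the same hash function, which is what makes searching fast.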

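1.1 Direct addressing

When the keys are densely clustered, direct addressing is the simplest and most effective method. For example, if the data all lies in [0, 99], an array of size 100 is enough, and each value serves directly as its own index. Similarly, if the data all lies in [a, z], an array of size 26 is enough, and subtracting the character 'a' from each value gives its index.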
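As a minimal sketch of the [0, 99] case described above, the class below uses each key directly as an array index. The name DirectAddressSet and its member functions are hypothetical, chosen only for illustration, and the sketch assumes every key passed in really lies in [0, 99].

```cpp
#include <array>

// Hypothetical direct-address set for keys known to lie in [0, 99].
// The key itself is the index, so no hash function is needed.
class DirectAddressSet {
public:
    void insert(int key) { present_[key] = true; }
    void erase(int key) { present_[key] = false; }
    bool contains(int key) const { return present_[key]; }

private:
    std::array<bool, 100> present_{};  // value-initialized: all false
};
```

Every operation is a single array access, which is why direct addressing is attractive whenever the key range is small enough to allocate.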

Example problem: 387. First Unique Character in a String (LeetCode)

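class Solution {
public:
    int firstUniqChar(string s) {
        // Direct addressing: arr[c - 'a'] counts the occurrences of letter c
        int arr[26] = {0};
        // Range-based for loop over the string
        for (auto e : s) {
            arr[e - 'a']++;
        }
        // The first character that occurs exactly once is the answer
        for (size_t i = 0; i < s.size(); i++) {
            if (arr[s[i] - 'a'] == 1) {
                return static_cast<int>(i);
            }
        }
        return -1;
    }
};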

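1.2 Hash collisions

Direct addressing suits keys that are densely clustered. When the keys are widely scattered, it would require an enormous array and waste a great deal of memory, so it is not a practical choice.

In that case we use a hash function h(key), which maps a key into the storage space to determine where the key is stored. This raises a problem: different keys may be mapped to the same position, which is called a hash collision. A well-designed hash function reduces collisions, but in practice they cannot be avoided entirely.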
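To make the idea of a collision concrete, here is a small sketch that assumes the simple hash function h(key) = key % M from section 1.4.1. The helper name hasCollision is hypothetical and is not part of the article's hash table.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true if at least two keys in `keys`
// land in the same slot under h(key) = key % M.
bool hasCollision(const std::vector<unsigned>& keys, std::size_t M) {
    std::vector<bool> used(M, false);
    for (unsigned key : keys) {
        std::size_t pos = key % M;   // slot chosen by the hash function
        if (used[pos]) return true;  // another key already mapped here
        used[pos] = true;
    }
    return false;
}
```

With M = 11, the keys 19 and 30 both map to slot 8, so they collide even though the table is almost empty. Whenever there are more distinct keys than slots, at least one collision is guaranteed.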

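1.3 Load factor

Suppose the storage space has size M and N items have already been stored in it. The load factor is then N/M. The larger the load factor, the higher the probability of collisions and the higher the space utilization. The smaller the load factor, the lower the probability of collisions and the lower the space utilization.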
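The definition above can be sketched in a few lines. The function names loadFactor and needsResize are hypothetical, and the 0.7 threshold is the one the open addressing listing later in this article uses.

```cpp
#include <cstddef>

// Hypothetical helper: load factor = N / M, where n items are stored
// in a table with m slots.
double loadFactor(std::size_t n, std::size_t m) {
    return static_cast<double>(n) / static_cast<double>(m);
}

// Hypothetical threshold check: true when N / M >= 0.7.
// Integer arithmetic (n * 10 >= m * 7) avoids floating-point rounding.
bool needsResize(std::size_t n, std::size_t m) {
    return n * 10 >= m * 7;
}
```

For a table with 11 slots, needsResize first becomes true at 8 stored items, since 80 >= 77 while 70 < 77.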

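1.4 Hash functions

1.4.1 Division method (remainder method)

As the name suggests, the key is reduced modulo the table size and the remainder is the storage position. With a storage space of size M and a key named key, the hash function is h(key) = key % M.

When using the division method, M should not be a power of 2 or a power of 10. If M is 2^3, key % M keeps only the lowest 3 bits of the binary representation, so any keys whose lowest 3 bits match are guaranteed to collide. For example, 3, 11, 19, 27, ... all end in 011 in binary, so each of them gives 3 modulo 8. A power-of-2 M therefore ignores all higher bits, which leads to a high collision rate whenever keys share a low-bit pattern. Likewise, if M is 10^3, key % M keeps only the last 3 decimal digits, so keys such as 1001, 2001, 3001, ... all collide, which makes powers of 10 a poor choice as well.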
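The effect of the table size can be checked with a short sketch. The helper names divisionHash and distinctSlots are hypothetical, and the alternative sizes 7 and 997 are simply nearby primes picked for comparison.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Division method: h(key) = key % M.
std::size_t divisionHash(unsigned long key, std::size_t M) {
    return key % M;
}

// Hypothetical helper: how many distinct slots a group of keys occupies.
// A result of 1 means every key collided into the same slot.
std::size_t distinctSlots(const std::vector<unsigned long>& keys, std::size_t M) {
    std::set<std::size_t> slots;
    for (unsigned long key : keys) {
        slots.insert(divisionHash(key, M));
    }
    return slots.size();
}
```

With M = 8 the keys 3, 11, 19, 27 all fall into slot 3, while with M = 7 they spread over four different slots. The same happens for 1001, 2001, 3001: one slot with M = 1000, three slots with M = 997.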


//HashTable.h
#include <algorithm>
#include <iostream>
#include <string>
#include <utility>
#include <vector>
using namespace std;

// HashTable implemented with open addressing
// (only linear probing is implemented)
inline unsigned long __stl_next_prime(unsigned long n)
{
    static const int __stl_num_primes = 28;
    static const unsigned long __stl_prime_list[__stl_num_primes] = {
        53, 97, 193, 389, 769, 1543, 3079, 6151, 12289, 24593,
        49157, 98317, 196613, 393241, 786433, 1572869, 3145739,
        6291469, 12582917, 25165843, 50331653, 100663319, 201326611,
        402653189, 805306457, 1610612741, 3221225473, 4294967291
    };
    const unsigned long* first = __stl_prime_list;
    const unsigned long* last = __stl_prime_list + __stl_num_primes;
    // lower_bound returns the first prime that is >= n
    const unsigned long* pos = lower_bound(first, last, n);
    return pos == last ? *(last - 1) : *pos;
}

// Slot states
enum STATE { EMPTY, EXIST, DELETE };

template<class K, class V>
struct HashData
{
    pair<K, V> _kv;
    STATE _state = EMPTY;
};

template<class K>
class HashFunc
{
public:
    size_t operator()(const K& key) const
    {
        return (size_t)key;
    }
};

// Specialization for string keys. A separate functor would also work, but string
// is such a common key type that specializing avoids passing a functor every time.
template<>
class HashFunc<string>
{
public:
    size_t operator()(const string& key) const
    {
        size_t ch = 0; // unsigned, so overflow wraps around instead of being undefined
        for (auto& e : key)
        {
            // BKDR-style hash: multiplying by 131 at each step keeps strings with the
            // same characters in a different order ("abcd", "dcba") from hashing equal
            ch += e;
            ch *= 131;
        }
        return ch;
    }
};

// The default functor means common key types need no explicit hash argument
template<class K, class V, class Hash = HashFunc<K>>
class HashTable
{
public:
    HashTable()
        :_table(11)
        ,_n(0)
    {}

    Hash hash;

    bool Insert(const pair<K, V>& kv)
    {
        // Duplicate keys are not allowed
        if (Find(kv.first))
            return false;

        // Keep the load factor at about 0.7
        if (_n * 10 / _table.size() >= 7)
        {
            // +1 is required: __stl_next_prime returns the first prime >= its argument,
            // so passing a size that is already in the list would not grow the table
            size_t size = __stl_next_prime(_table.size() + 1);
            HashTable newHash;
            newHash._table.resize(size);
            for (auto& e : _table)
            {
                if (e._state == EXIST)
                    newHash.Insert(e._kv);
            }
            _table.swap(newHash._table);
            _n = newHash._n; // slots marked DELETE are dropped by the rehash
        }

        // Find a slot
        size_t pos = hash(kv.first) % _table.size();
        while (_table[pos]._state != EMPTY) // linear probing
        {
            pos = (pos + 1) % _table.size(); // wrap around at the end of the table
        }

        // Insert
        _table[pos]._kv = kv;
        _table[pos]._state = EXIST;
        _n++;
        return true;
    }

    HashData<K, V>* Find(const K& key)
    {
        size_t pos = hash(key) % _table.size();
        while (_table[pos]._state != EMPTY)
        {
            if (_table[pos]._state == EXIST && _table[pos]._kv.first == key)
            {
                return &_table[pos];
            }
            pos = (pos + 1) % _table.size(); // wrap around at the end of the table
        }
        return nullptr;
    }

    // Erase does not physically remove the element; it only marks the slot as DELETE
    bool Erase(const K& key)
    {
        HashData<K, V>* ret = Find(key);
        if (ret)
        {
            ret->_state = DELETE;
            return true;
        }
        return false;
    }

private:
    // Each HashData holds a pair and the state of the slot
    vector<HashData<K, V>> _table;
    size_t _n; // number of slots in use (Erase leaves the slot marked, so it is not decremented)
};

//test.cpp
#include "HashTable.h"

// Functor for string keys. Because string is such a common key type, the
// HashFunc<string> specialization in the header is the recommended route.
struct StringHashFunc
{
    size_t operator()(const string& key) const
    {
        size_t ret = 0;
        for (auto& e : key)
        {
            // BKDR-style hash
            ret += e;
            ret *= 131;
        }
        return ret;
    }
};

// Date class (a custom key type must support operator==, which Find uses)
struct Date
{
    bool operator==(const Date& key) const
    {
        return _year == key._year && _month == key._month && _day == key._day;
    }

    // Important: the default arguments make this usable as a default constructor.
    // HashData<K, V> in the header default-constructs its pair<K, V> member, so a
    // key type without a default constructor fails to compile.
    Date(const int year = 1, const int month = 1, const int day = 1)
    {
        _day = day;
        _month = month;
        _year = year;
    }

    int _day;
    int _month;
    int _year;
};

// Functor for Date keys
struct DateHashFunc
{
    size_t operator()(const Date& key) const
    {
        // BKDR-style hash
        size_t ret = 0;
        ret += key._day;
        ret *= 131;
        ret += key._month;
        ret *= 131;
        ret += key._year;
        ret *= 131;
        return ret;
    }
};

int main()
{
    // Test with int keys
    HashTable<int, int> hash;
    for (int i = 0; i < 15; i++)
    {
        hash.Insert({ i, i });
    }
    cout << hash.Find(10) << endl;
    hash.Erase(10);
    cout << hash.Find(10) << endl;

    // Test with string keys
    HashTable<string, string, StringHashFunc> hash1;
    vector<string> arr = { "abc", "asd", "qwe", "fgh" };
    for (auto& e : arr)
    {
        hash1.Insert({ e, e });
    }
    cout << HashFunc<string>()("abc") << endl;
    cout << HashFunc<string>()("asd") << endl;
    cout << HashFunc<string>()("qwe") << endl;
    cout << HashFunc<string>()("fgh") << endl;
    //cout << StringHashFunc()("abc") << endl;
    //cout << StringHashFunc()("asd") << endl;
    //cout << StringHashFunc()("qwe") << endl;
    //cout << StringHashFunc()("fgh") << endl;

    // Test with a custom key type
    HashTable<Date, Date, DateHashFunc> hash2;
    Date r1(2025, 4, 1);
    Date r2(2025, 1, 4);
    hash2.Insert({ r1, r1 });
    hash2.Insert({ r2, r2 });
}
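The Erase comment in the listing says deletion only changes the slot state. The sketch below illustrates why, using a stripped-down table of int keys. All names here (SlotState, Slot, findSlot, insertKey, eraseKey) are hypothetical and separate from the HashTable class above, and the sketch assumes non-negative keys and a table that always keeps at least one empty slot.

```cpp
#include <cstddef>
#include <vector>

enum SlotState { SLOT_EMPTY, SLOT_EXIST, SLOT_DELETE };

struct Slot {
    int key = 0;
    SlotState state = SLOT_EMPTY;
};

// Linear-probing lookup: walk forward from the home slot until an EMPTY
// slot is reached. Returns the slot index, or -1 if the key is absent.
int findSlot(const std::vector<Slot>& table, int key) {
    std::size_t pos = static_cast<std::size_t>(key) % table.size();
    while (table[pos].state != SLOT_EMPTY) {
        if (table[pos].state == SLOT_EXIST && table[pos].key == key) {
            return static_cast<int>(pos);
        }
        pos = (pos + 1) % table.size();
    }
    return -1;
}

// Linear-probing insert without duplicate checks or resizing.
void insertKey(std::vector<Slot>& table, int key) {
    std::size_t pos = static_cast<std::size_t>(key) % table.size();
    while (table[pos].state != SLOT_EMPTY) {
        pos = (pos + 1) % table.size();
    }
    table[pos].key = key;
    table[pos].state = SLOT_EXIST;
}

// Erase either by marking the slot DELETE (keeps probe chains intact)
// or by resetting it to EMPTY (cuts the chain for later keys).
bool eraseKey(std::vector<Slot>& table, int key, bool useTombstone) {
    int pos = findSlot(table, key);
    if (pos < 0) return false;
    table[pos].state = useTombstone ? SLOT_DELETE : SLOT_EMPTY;
    return true;
}
```

In a table of size 7, keys 1 and 8 share home slot 1, so 8 ends up in slot 2. If 1 is erased by resetting its slot to EMPTY, a later lookup of 8 stops at slot 1 and wrongly reports the key as missing. Marking the slot DELETE lets the probe continue to slot 2.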

//HashTable.h
#include <algorithm>
#include <iostream>
#include <string>
#include <utility>
#include <vector>
using namespace std;

// HashTable implemented with separate chaining
inline unsigned long __stl_next_prime(unsigned long n)
{
    static const int __stl_num_primes = 28;
    static const unsigned long __stl_prime_list[__stl_num_primes] = {
        53, 97, 193, 389, 769, 1543, 3079, 6151, 12289, 24593,
        49157, 98317, 196613, 393241, 786433, 1572869, 3145739,
        6291469, 12582917, 25165843, 50331653, 100663319, 201326611,
        402653189, 805306457, 1610612741, 3221225473, 4294967291
    };
    const unsigned long* first = __stl_prime_list;
    const unsigned long* last = __stl_prime_list + __stl_num_primes;
    // lower_bound returns the first prime that is >= n
    const unsigned long* pos = lower_bound(first, last, n);
    return pos == last ? *(last - 1) : *pos;
}

// Named HashNode so that the alias Node inside HashTable does not redeclare
// a name it also uses, which some compilers reject
template<class K, class V>
struct HashNode
{
    HashNode(const pair<K, V>& kv)
        :_kv(kv)
        ,next(nullptr)
    {}
    HashNode() {}

    pair<K, V> _kv;
    HashNode* next = nullptr;
};

// Functor that converts a key to an unsigned integer
template<class K>
struct HashFunc
{
    size_t operator()(const K& key) const
    {
        return (size_t)key;
    }
};

// Specialization for the common key type string
template<>
struct HashFunc<string>
{
    size_t operator()(const string& key) const
    {
        size_t ret = 0; // unsigned, so overflow wraps around instead of being undefined
        for (auto& e : key)
        {
            // BKDR-style hash
            ret += e;
            ret *= 131;
        }
        return ret;
    }
};

template<class K, class V, class Hash = HashFunc<K>>
class HashTable
{
public:
    using Node = HashNode<K, V>;

    HashTable()
        :_table(__stl_next_prime(0))
        ,_n(0)
    {}

    ~HashTable()
    {
        for (size_t i = 0; i < _table.size(); i++)
        {
            Node* cur = _table[i].next;
            while (cur)
            {
                Node* next = cur->next;
                delete cur;
                cur = next;
            }
        }
    }

    Hash hash;

    bool Insert(const pair<K, V>& kv)
    {
        // Duplicate keys are not allowed
        if (Find(kv.first))
        {
            return false;
        }

        // Resize (with chaining the load factor is kept at about 1).
        // Preferred approach: relink the existing nodes into the new table,
        // which avoids allocating every node again.
        if (_n / _table.size() >= 1)
        {
            vector<Node> newtable(__stl_next_prime(_table.size() + 1));
            for (size_t i = 0; i < _table.size(); i++)
            {
                Node* cur = _table[i].next;
                while (cur)
                {
                    Node* next = cur->next;
                    // Map the node to its position in the new table
                    size_t pos = hash(cur->_kv.first) % newtable.size();
                    // Head insertion into the new bucket
                    cur->next = newtable[pos].next;
                    newtable[pos].next = cur;
                    cur = next;
                }
            }
            _table.swap(newtable);
        }

        // Weaker alternative, kept for reference: reuse Insert on a temporary table
        // and swap. Every node is allocated again, and the destructor must be written
        // explicitly so the old nodes are freed and no memory leaks.
        //if (_n / _table.size() >= 1)
        //{
        //    HashTable<K, V, Hash> newHash;
        //    newHash._table.resize(__stl_next_prime(_table.size() + 1));
        //    for (size_t i = 0; i < _table.size(); i++)
        //    {
        //        Node* cur = _table[i].next;
        //        while (cur)
        //        {
        //            newHash.Insert(cur->_kv); // reuse Insert
        //            cur = cur->next;
        //        }
        //    }
        //    _table.swap(newHash._table);
        //}

        // Map the key to a bucket
        size_t hash0 = hash(kv.first) % _table.size();
        Node* cur = _table[hash0].next;
        Node* newNode = new Node(kv); // needs the explicit HashNode(const pair&) constructor
        // Head insertion
        _table[hash0].next = newNode;
        newNode->next = cur;
        // Count the element
        _n++;
        return true;
    }

    Node* Find(const K& key)
    {
        size_t pos = hash(key) % _table.size();
        Node* cur = _table[pos].next;
        while (cur)
        {
            if (cur->_kv.first == key)
                return cur;
            cur = cur->next;
        }
        return nullptr;
    }

    bool Erase(const K& key)
    {
        // Map the key to a bucket
        size_t pos = hash(key) % _table.size();
        Node* cur = _table[pos].next;
        Node* per = &_table[pos]; // node before cur
        while (cur)
        {
            if (cur->_kv.first == key)
            {
                per->next = cur->next;
                delete cur;
                _n--; // keep the element count in step with the table
                return true;
            }
            per = cur;
            cur = cur->next;
        }
        return false;
    }

private:
    vector<Node> _table; // each element is a sentinel head node; the chain starts at its next pointer
    size_t _n = 0;       // number of stored elements
};

//test.cpp
#include "HashTable.h"

struct Date
{
    bool operator==(const Date& key) const
    {
        return _day == key._day && _month == key._month && _year == key._year;
    }

    Date(const int year = 2025, const int month = 4, const int day = 2)
        :_day(day)
        ,_month(month)
        ,_year(year)
    {}

    int _day;
    int _month;
    int _year;
};

struct DateFunc
{
    size_t operator()(const Date& key) const
    {
        size_t ret = 0;
        ret += key._day;
        ret *= 131;
        ret += key._month;
        ret *= 131;
        ret += key._year;
        ret *= 131;
        return ret;
    }
};

int main()
{
    // Test with int keys
    //HashTable<int, int> Hash0;
    //Hash0.Insert({ 1,1 });
    //Hash0.Insert({ 2,2 });
    //Hash0.Insert({ 3,3 });
    //Hash0.Insert({ 4,4 });
    //cout << Hash0.Find(3) << endl;
    //cout << Hash0.Erase(3) << endl;
    //cout << Hash0.Find(3) << endl;
    //cout << Hash0.Erase(10) << endl;

    // Test with string keys
    //HashTable<string, string> Hash0;
    //Hash0.Insert({ "abc","abc" });
    //Hash0.Insert({ "huang","huang" });
    //Hash0.Insert({ "yu","yu" });
    //Hash0.Insert({ "chi","chi" });
    //cout << Hash0.Find("huang") << endl;
    //cout << Hash0.Erase("huang") << endl;
    //cout << Hash0.Find("huang") << endl;
    //cout << Hash0.Erase("asdf") << endl;

    // Test with a custom key type
    HashTable<Date, Date, DateFunc> Hash0;
    Date r1(2025, 4, 2);
    Date r2(2025, 2, 4);
    Hash0.Insert({ r1, r1 });
    Hash0.Insert({ r2, r2 });
    cout << Hash0.Find(r2) << endl;
    cout << Hash0.Erase(r2) << endl;
    cout << Hash0.Find(r2) << endl;
    /*cout << Hash0.Erase({ 2000,1,1 }) << endl;*/
}
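Both listings hash strings with a BKDR-style loop, and a code comment says this keeps strings such as "abcd" and "dcba" apart. The sketch below compares that loop with a plain character sum. The function names sumHash and bkdrHash are hypothetical, introduced only for this comparison.

```cpp
#include <cstddef>
#include <string>

// Naive hash: the sum of the character codes. Any two strings made of the
// same characters in a different order produce the same value.
std::size_t sumHash(const std::string& s) {
    std::size_t h = 0;
    for (unsigned char c : s) {
        h += c;
    }
    return h;
}

// BKDR-style hash in the same form as the listings: add the character,
// then multiply by 131, so each position contributes with a different weight.
std::size_t bkdrHash(const std::string& s) {
    std::size_t h = 0;
    for (unsigned char c : s) {
        h += c;
        h *= 131;
    }
    return h;
}
```

Here sumHash("abcd") equals sumHash("dcba"), while bkdrHash gives the two strings different values because the multiplication makes the result depend on character order.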

