Hash Table
1. Pigeonhole Principle
The Pigeonhole Principle states that if n pigeons are placed into m pigeonholes, with n > m, at least one pigeonhole must hold more than one pigeon.
A hash table is an application of this principle.
2. What is a Hash Table?
A Hash Table is a data structure that maps keys to values using a hash function. It condenses a large address space into a smaller, manageable index range, enabling efficient data retrieval.
An example of a hashing function:
def hash(key):
''' transform/map key to value (index) '''
total = 0
for char in key:
total += ord(char)
return total % N% N confines the index within a valid range. The choice of N should consider the growth of the hash table.
3. Collisions and Resolution Strategies
A collision occurs when two different keys hash to the same index. This can lead to non-optimal performance and requires strategies to handle collisions:
- Linear Probing: Add to the next available
+ilocation. - Quadratic Probing: Add to the next available
+i^2location. - Chaining: Create a linked list for each hash table location; add collided records to the appropriate list.
4. Hash Table with Linear Probing
This implementation resolves collisions using linear probing.
def hash(key):
return key % N
def insert(key, value):
index = hash(key)
if hash_table[index] == -1: # available
hash_table[index] = value # no collision
else: # collision
i = (index + 1) % N
while hash_table[i] != -1: # find next free slot
i = (i + 1) % N # linear probing
hash_table[i] = value # insert collided record
def search(key, target):
index = hash(key)
if hash_table[index] == -1:
return "Record not found"
elif hash_table[index] != target: # collided record
i = (index + 1) % N
while hash_table[i] != -1:
if hash_table[i] == target:
return "Found in collided records"
i = (i + 1) % N
return "Not found in collided records"
else:
return hash_table[index] # non-collided record
# main
N = 10
hash_table = [-1 for i in range(N)]
records = {1234: 'Tom', 651: 'Mary', 238: 'Ali'}
for key, value in records.items():
insert(key, value)
print(hash_table)Output after inserting and searching records will demonstrate linear probing.
5. Hash Table with Quadratic Probing
This implementation resolves collisions using quadratic probing.
def hash(key):
return key % N
def insert(key, value):
index = hash(key)
if hash_table[index] == -1: # available
hash_table[index] = value # no collision
else: # collision
i = (index + 1) % N
k = 2
while hash_table[i] != -1 and k < N:
i = (i + k**2) % N # quadratic probing
k += 1
hash_table[i] = value # insert collided record
# main
N = 10
hash_table = [-1 for i in range(N)]
records = {1234: 'Tom', 651: 'Mary', 238: 'Ali'}
for key, value in records.items():
insert(key, value)
print(hash_table)Quadratic probing helps reduce primary clustering.
6. Hash Table with Chaining
This implementation resolves collisions by using linked lists.
class Node:
def __init__(self, data=None):
self.data = None
self.next = None
def chain_insert(key, value):
index = hash(key)
if hash_table[index].data is None:
hash_table[index].data = value
else:
curr = hash_table[index]
while curr.next:
curr = curr.next
curr.next = Node(value)
# main
N = 10
hash_table = [Node() for i in range(N)]
records = {1234: 'Tom', 651: 'Mary', 238: 'Ali'}
for key, value in records.items():
chain_insert(key, value)
print(chain_search(1234, 'Tom')) # Original record
print(chain_search(7654, 'Alex')) # Collided recordChaining provides a flexible way to handle collisions but uses additional memory.
7. Hashing Function for Strings
A hashing function that computes the sum of ASCII values modulo a number.
def hash(key):
total = 0
for char in key:
total += ord(char)
return total % 13
# main
print(hash("Good morning!"))
print(hash("How are you?"))This demonstrates how strings can be hashed effectively.