#ļøā£ Hashing & Hash Tables
š” Intermediate Core Concept 60 min read
šÆ What is Hashing?
Hashing converts a key into an integer index using a hash function. This index determines where the value is stored in a Hash Table. The magic? O(1) average time for insert, delete, and search operations!
āļø How Hash Tables Work
1. Key ā Hash Function ā Hash Code ā Index
"Alice" ā hash("Alice") ā 2847 ā 2847 % 10 = 7
2. Store at index 7: table[7] = "Alice's Data"
3. Lookup: "Alice" ā hash ā index 7 ā return table[7]
š§ Hash Function Properties
ā Good Hash Function
- Deterministic: Same key ā same hash
- Uniform distribution: Spreads keys evenly
- Fast computation: O(1) to compute
- Avalanche effect: Small key change ā big hash change
ā ļø Collision
When two different keys produce the same hash/index. Inevitable with limited table size (pigeonhole principle). Must be handled!
š Collision Resolution Strategies
1. Chaining (Open Hashing)
// Each bucket is a linked list
// table[5] ā [key1|val1] ā [key2|val2] ā null
// Both key1 & key2 hashed to index 5
class HashTableChaining {
private array $table;
private int $size;
public function __construct(int $size = 10) {
$this->size = $size;
$this->table = array_fill(0, $size, []);
}
private function hash(string $key): int {
$hash = 0;
for ($i = 0; $i < strlen($key); $i++) {
$hash = (($hash << 5) - $hash) + ord($key[$i]);
$hash |= 0; // 32-bit int
}
return abs($hash) % $this->size;
}
public function put(string $key, mixed $value): void {
$index = $this->hash($key);
// Update if key exists
foreach ($this->table[$index] as &$pair) {
if ($pair[0] === $key) { $pair[1] = $value; return; }
}
$this->table[$index][] = [$key, $value];
}
public function get(string $key): mixed {
$index = $this->hash($key);
foreach ($this->table[$index] as $pair) {
if ($pair[0] === $key) return $pair[1];
}
return null;
}
public function remove(string $key): void {
$index = $this->hash($key);
foreach ($this->table[$index] as $i => $pair) {
if ($pair[0] === $key) { unset($this->table[$index][$i]); return; }
}
}
}
2. Open Addressing (Linear Probing)
// If index is occupied, check next slot (index+1, index+2...)
class HashTableProbing {
private array $keys, $values;
private int $size, $count;
public function __construct(int $size = 10) {
$this->size = $size; $this->count = 0;
$this->keys = array_fill(0, $size, null);
$this->values = array_fill(0, $size, null);
}
private function hash(string $key): int {
$h = crc32($key); return abs($h) % $this->size;
}
public function put(string $key, mixed $value): void {
if ($this->count >= $this->size / 2) $this->resize($this->size * 2);
$i = $this->hash($key);
while ($this->keys[$i] !== null && $this->keys[$i] !== $key) {
$i = ($i + 1) % $this->size; // Linear probing
}
if ($this->keys[$i] === null) $this->count++;
$this->keys[$i] = $key; $this->values[$i] = $value;
}
public function get(string $key): mixed {
$i = $this->hash($key);
while ($this->keys[$i] !== null) {
if ($this->keys[$i] === $key) return $this->values[$i];
$i = ($i + 1) % $this->size;
}
return null;
}
private function resize(int $newSize): void { /* ... */ }
}
š Complexity Analysis
| Operation | Average | Worst Case |
|---|---|---|
| Insert | O(1) | O(n) |
| Search | O(1) | O(n) |
| Delete | O(1) | O(n) |
Worst case occurs with many collisions (bad hash function or load factor too high).
š PHP & Hashing
// PHP's associative arrays ARE hash tables!
$map = ['name' => 'John', 'age' => 30];
echo $map['name']; // O(1)
// Check key existence - O(1)
if (array_key_exists('name', $map)) { /* ... */ }
if (isset($map['name'])) { /* ... */ }
// Common hash table patterns
$freq = [];
foreach (['a','b','a','c','b','a'] as $c) {
$freq[$c] = ($freq[$c] ?? 0) + 1;
}
// $freq = ['a'=>3, 'b'=>2, 'c'=>1]
// Two Sum with hash map - O(n)
function twoSum(array $nums, int $target): array {
$seen = [];
foreach ($nums as $i => $num) {
$complement = $target - $num;
if (isset($seen[$complement])) {
return [$seen[$complement], $i];
}
$seen[$num] = $i;
}
return [];
}
š¼ Interview Questions
When two different keys hash to the same index. Must be handled via chaining (linked lists) or open addressing (probing).
Load factor = n/size (filled buckets / total buckets). When it exceeds threshold (typically 0.75), the table resizes to reduce collisions and maintain O(1) performance.