The Generic Open Addressing Hash Table Library
in C implements a high-performance hash table using linear probing for collision resolution. It efficiently stores and retrieves key-value pairs with minimal overhead, leveraging the FNV-1a (Fowler-Noll-Vo)
hash algorithm to compute hash values. This data structure is particularly suited for applications requiring fast lookups, insertions, and deletions with a focus on simplicity and scalability.
Index | Key | Value |
---|---|---|
0 | "key1" | "value1" |
1 | "key2" | "value2" |
2 | "key3" | "value3" |
3 | "key4" | "value4" |
4 | "key5" | "value5" |
- 🛠️ Initialization of hash table
- ➕ Insertion of key-value pairs
- 🔍 Retrieval of values by key
- ❌ Deletion of key-value pairs
- 🔑 Check if a key exists in the hash table
- 🔄 Auto-resizing of the hash table
- 🧹 Freeing of the hash table
- 📏 Getting the size of the hash table (number of slots)
- 🔢 Getting the count of elements in the hash table
To use this library, you can either include the hashtable.h
header file locally or install it system-wide using the provided Makefile
.
-
Clone the repository
git clone https://github.com/pr00x/hashtable-c.git
-
Include the header file in your project:
#include "hashtable.h"
-
Compile your program along with the hash table source file:
gcc myprogram.c hashtable.c -o myprogram
-
Clone the repository:
git clone https://github.com/pr00x/hashtable-c.git
-
Build and install the library and header file:
sudo make install
This installs the library to
/usr/local/lib
and the header to/usr/local/include
. -
Compile your program by linking to the installed library:
gcc myprogram.c -o myprogram -lhashtable
To remove the installed library and header files, run:
sudo make uninstall
To remove the compiled library and header files from the source directory, run:
sudo make clean
Uninstall and clean in one command:
sudo make clean uninstall
#include <stdio.h>
#include "hashtable.h" // or <hashtable.h> if it's installed
int main(void) {
HashTable *ht = ht_init(10);
int year = 2024;
ht_set(ht, "lion", "savannah");
ht_set(ht, "tiger", "forest");
ht_set(ht, "elephant", "grassland");
ht_set(ht, "year", &year);
ht_set(ht, "penguin", "antarctica");
ht_set(ht, "kangaroo", "australian outback");
printf("The lion lives in the: %s\n", (const char *)ht_get(ht, "lion"));
printf("Current year: %d\n", *(int *)ht_get(ht, "year"));
ht_set(ht, "lion", "zoo");
printf("The updated lion lives in the: %s\n", (const char *)ht_get(ht, "lion"));
if(ht_has(ht, "tiger")) {
puts("The tiger is in the hash table!");
}
else {
puts("The tiger is not in the hash table.\n");
}
ht_delete(ht, "penguin");
puts("Deleted the penguin entry.");
if(ht_has(ht, "penguin")) {
puts("The penguin is still in the hash table!");
}
else {
puts("The penguin has been deleted from the hash table.");
}
printf("Size of hash table: %zu\n", ht_size(ht));
printf("Count of animals in hash table: %zu\n", ht_count(ht));
// Free the memory allocated for the hash table
ht_free(&ht);
return 0;
}
The Generic Open Addressing Hash Table is designed to offer high performance for common operations, including insertion, deletion, and lookup, while minimizing memory overhead. Its efficiency is mainly determined by the hash function, collision resolution method, and dynamic resizing mechanism. Below is a summary of how these factors contribute to the overall performance:
-
Insertion: O(1) on average. If the table is not full and there are no collisions, new elements are inserted in constant time. However, if a collision occurs, linear probing is used to resolve it, but the average time complexity remains O(1) due to the low load factor.
-
Lookup: O(1) on average. The key-value pair can be retrieved in constant time if no collisions or minimal probing is needed. The hash function computes the index quickly, and if collisions occur, the linear probing allows for quick access to the correct entry.
-
Deletion: O(1) on average. Deletion of an element is done in constant time, with linear probing employed to find the key. Once the key is found, it is removed, and the table is reorganized if necessary to maintain the integrity of the hash table.
-
Resizing: O(n) during rehashing. When the load factor exceeds 0.7, the hash table is resized (doubled), and all elements are rehashed to their new positions. This operation takes O(n) time but happens infrequently (only when resizing occurs).
The hash table is space-efficient, using a simple array of slots for storage. The only additional memory overhead comes from the storage of keys and values. The resizing process involves creating a new, larger array, but this helps ensure the table can scale efficiently as more elements are added.
Collisions are resolved using linear probing, which ensures that all elements remain in contiguous slots. While this method is simple and effective, it may cause performance degradation as the table becomes more crowded. To maintain efficiency, the table automatically resizes when the load factor exceeds 0.7.
While some operations, like resizing and rehashing, may temporarily increase the cost of operations, these events happen infrequently. Over many operations, the average cost of insertion, deletion, and lookup remains constant, making the table highly efficient for long-term use.
The hash table is scalable, meaning it can efficiently grow as the number of stored elements increases. The resizing mechanism ensures that the hash table can handle large datasets without significant performance loss.
By maintaining a low load factor and resizing dynamically, the hash table can handle a high number of elements while keeping operations efficient, even under heavy load.
The hash table is generic, allowing you to store any data type as values. Whether it's an integer, float, struct, or even another hash table, the hash table can handle it! The values are stored as void *
pointers, meaning the hash table is flexible and can accommodate all kinds of data types.
Strings are Immutable: The keys in the hash table must be strings (const char *), and they should remain immutable. This ensures that the keys do not change during their lifetime, preventing issues during lookups or resizing. Modifying the strings after insertion may lead to undefined behavior.
Store Any Data Type: The values stored in the hash table can be any data type, provided that they are passed as pointers. Whether you’re storing simple types like integers or complex structures, the hash table can accommodate them all, making it a powerful tool for a wide range of applications.
Contributions are welcome! Please create a pull request or submit an issue for any feature requests or bug reports.
This project is licensed under the MIT License. See the LICENSE
file for details.