Welcome to another lesson from NCP!! I hope you guys are doing well 😊. We all know what a hash is, right? It looks something like this:
Yeah…..well this is not the hash we will be talking about today. The hash we are discussing here is something completely different.
Hash Table is a data structure that every programmer should know about, because if you turn up for an interview and “What is a hash table?” is the question that could get you the job then being a Noob Code Pro follower, you should be able to nail that question. In order to do that, all you need to do is go through this article and……that’s it ;)
WHAT ARE HASH TABLES?
A Hash Table is a data structure that stores element in a series of slots or buckets. Each slot has a unique identifying key. Hash tables include a number of slots equal to a prime number.
For example, let’s say we have a collection of 5 integers to store:
86 6 22 38 68
For such a handful of numbers, we don’t need a large hash table, one with 7 slots should be just fine. Why 7? Remember I mentioned above that a hash table must have slots equal to a prime number, 7 is the closest prime number that can store 5 integers and still have room for more. Don’t worry, my friend!! I’ll explain why a hash table should have slots equal to a prime number.
Each slot is numbered, the first slot is 0 and the last slot is 6 and we can decide which slot to put each number in by performing a hash.
Now you can also put each integer in each slot starting from 0 to 4, so the integer ‘86’ will be stored in slot 0 and ‘68’ will be stored in slot 4. That should do the work, right? Yes, but this way you have created a python list. Accessing elements from a list requires a linear search and accessing elements from a hash table simply requires you to perform a hash and you will find the slot in which the value you are looking for is stored. In other words, looking for an element in a hash table is way quicker than looking for an element in a python list. Handling just five integers really is not a good way to test how efficient a hash table can be over a list, but when you are working with large data, hash tables make an enormous difference.
A hash takes a number or a group of numbers and performs an arithmetic or logical transformation to produce a value. For example, modulo(%) division is a very simple hashing technique.
HOW TO ADD VALUES TO A HASH TABLE?
To hash the number ‘68’, we can perform a modulo division by 7 (the number of slots). The result ‘5’ indicates the slot for the number ‘68’ in the hash table.
0 1 2 3 4 5 6
68
In case of a list, it would have been:
0 1 2 3 4 5 6
68
In this case, if we want to search ‘68’, python performs a linear search (something we talked about in a previous lesson), starting from 0.
But when we create a hash table, all we need to do to find the slot number, is perform a modulo division of the number by the number of slots, i.e. 68 % 7 = 5 and there you go, you can see that ‘68’ is stored in slot 5.
WHAT HAPPENS IF A VALUE ALREADY EXISTS IN A SLOT?
Hash Tables are not perfect. A collision is a state when an element is to be added and hashed but the slot number is already occupied by another element. For example, in the above example, if we want to add ‘45’, 45 % 7 = 3, but slot 3 is occupied by ‘38’(because 38 % 7 = 3).
To resolve a collision, the new element can be simply assigned to the next available slot. This conflict increases the search time of the element, because you will have to hash the value to find its slot number and then realize that it is not the value you are looking for and look at the next available slot and find your value there.
The goal of hash table construction should be to use the correct number of slots and hash function that produces the fewest number of collisions.
NOTE: Hash tables should be the length of a prime number because prime numbers have few common factors, which prevent hashed values from clumping around common factors such as 2 and 3.
CONCLUSION:
Hash Tables are way more efficient than a python list when dealing with large data. But hash tables have their flaws as well, a collision brings out most of them. Having a prime number of slots is just a way to minimize the number of collisions that may occur.
Do you know any other way of handling collisions and minimizing it? Share it with us in the comments below! If you found this lesson on ‘Hash Tables’ helpful, then do share it with your friends who love programming like you do. Leaving a like would be another way to show some appreciation ;)
Have some queries or questions? You can always find me in the comments section, telegram channel or my Pinterest profile where you can personally talk to me and ask me questions about anything we have learnt so far.
If you are looking to join a community of programmers, you can join Noob Code Pro’s official telegram channel and be among programmers like you for free.
Stay tuned for another article next week, same time, where we will discuss about a new topic/concept in programming, what they are, how they work and where you use them. More cool stuff coming your way, DON’T MISS IT !! And I'll see you next week. Goodbye and Good Luck :)
I hope this article answered all of your questions and even helped you in becoming a better programmer. FOLLOW NOOB CODE PRO TO BECOME A PROFESSIONAL PYTHON PROGRAMMER FROM A TOTAL BEGINNER.
HOPE YOU HAVE AN AWESOME DAY AHEAD !!!

0 Comments
Welcome to the comments section, this is where you can contact me personally for any doubts or feedback