Dublin Tech: Consistent Hashing

Consistent Hashing is a clever algorithm that is used in high volume caching architectures where scaling and availability are important. It is used in many high end web architectures for example: Amazon's Dynamo. Let me try and explain it!

Firstly let's consider the problem.

Let's say your website sells books (sorry Amazon but you're a brilliant example). Every book has an author, a price, details such as the number of pages and an ISBN which acts as a primary key uniquely identifying each book. To improve the performance of your system you decide to cache the books. You split the cache over four servers. You have to decide which book to put on which server. You want to do this using a deterministic function so you can be sure where things are. You also want to do this at low computational cost (otherwise what's the point caching). So you hash the book's ISBN and then mod the result by the number of servers which in our case is 4. Let's call this number the book's hash key.

So let's say your books are:

Toward the Light, A.C. Grayling (ISBN=0747592993)
Aftershock, Philippe Legrain (ISBN=1408702231)
The Outsider, Albert Camus (ISBN=9780141182506)
This History of Western Philosophy, Bertrand Russell (ISBN=0415325056)
The life you can save, Peter Singer (ISBN=0330454587)

... etc

After hashing the ISBN and moding the result by 4, let's say the resulting hash keys are:

Hash(Toward the Light) % 4 = 2. Hashkey 2 means this book will be cached by Server 2.
Hash(Aftershock) % 4 = 1. Hashkey 1 means this book will be cached by Server 1.
Hash(The Outsider) % 4 = 4. Hashkey 1 means this book will be cached by Server 4.
Hash(The History of Western Philosophy) % 4 = 1. Hashkey 1 means this book will be cached by Server 1.
Hash(The Life you can save) % 4 = 3. Hashkey 1 means this book will be cached by Server 3.

Oh wow doesn't everything look so great. Anytime we have a book's ISBN we can work out its hash key and know what server its on! Isn't that just so amazing! Well no. Your website has become so cool, more and more people are using it. Reading has become so cool there are more books you need to cache. The only thing that hasn't become so cool is your system. Things are slowing down and you need to scale. Vertical scaling will only get you so far; you need to scale horizontally.

Ok, so you go out and you buy another 2 servers thinking this will solve your problem. You now have six servers. This is where you think the pain will end but alas it won't. Because you now have 6 servers so your algorithm changes. Instead of moding by 4 you mod by 6. What does this mean? Initially, when you look for a book because moding by 6 will mean you'll end up with a different hash key for it and hence a different server to look for it on. It won't be there and you'll have incurred a database read to bring it back into the cache. It's not just one book, it will be the for the majority of your books. Why? Because the only time a book will be on the correct server and not need to be re-read from the database is when the hash(isbn) % 4 = hash(isbn) % 6. Mathematically this will be the minority of your books.

So, your attempt at scaling has put a burden on the majority of your cache to restructure itself resulting in massive database re-reads. This can bring your system down. Customers won't be happy with you sunshine!

We need a solution!

The solution is to come up with a system which has the characteristic that when you add more servers and only a small minority of books will move to new servers meaning a minimum number of database reads. Let's go for it!

Consistent Hashing explained

Consistent hashing is an approach where the books get the same hash key irrespective of the number of books and irrespective of the number of servers - unlike our previous algorithm which mod'ed by the number of servers. It doesn't matter if there is one server, 5 servers or 5 million servers, the books always always always always get the same hash key.

So how exactly do we generate consistent hash values for the books?

Simple. We use a similar approach to our initial approach except we stop moding on the number of servers. Instead we mod by something else, that is constant and independent of the number of servers.

Ok, so let's hash the ISBN as before and then mod by 100. So if you have 1,000 books. You end up with a distribution of hash keys for the books between 0 - 100 irrespective of the number of servers. All good. All we need is a way to figure determinstically and at low computational cost which books reside on which servers. Otherwise again what would be the point in caching?

So here's the ultra funky part... You take something unique and constant for each server (for example its IP address) and you pass that through the exact same algorithm. This means you also end up with a hash key (in this case a number between 0 and 100) for each server.

Let's say:

Server 1 gets: 12
Server 2 gets: 37
Server 3 gets: 54
Server 4 gets: 87

Now we assign each server to be responsible for caching the books with hash keys between its own hash key and that of the next neighbour (next in the upward direction).

This means:

Server 1 stores all the books with hash key between 12 and 37
Server 2 stores all the books with hash key between 37 and 54
Server 3 stores all the books with hash key between 54 and 87
Server 4 stores all the books with hash key between 87 and 100 and 0 and 12.

If you are still with me... great. Because now we are going to scale. We are going to add two more servers. Lets say server 5 is added and gets the hash key 20. And server 6 is added and gets the hash value 70.
This means:

Server 1 will now only store books with hash key between 12 and 20
Server 5 will stores the books with hash key between 20 and 37.
Server 3 will now only store books with hash key between 54 and 70.
Server 6 will stores books with the hash key between 70 and 87.

Server 2 and Server 4 are completly unaffected.

Ok so this means:

All books still get the same hash key. Their hash keys are consistent.
Books with hash keys between 20 and 37 and between 70 and 87 are now sought from new servers.
The first time they are sought they won't be there and they will be re-read from the system and then cached in the respective servers. This is ok as long as it's only for a small amount of books.
There is a small initial impact to the system but its managable.

Now, you're probably saying:
"I get all this but I'd like to see some better distribution. When you added two servers, only two servers got their load lessoned. Could you share the benefits please?"

Of course. To do that, we allocate each server a number of small ranges rather than just one large range. So, instead of server 2 getting one large range between 37 and 54. It gets a number of small ranges. So for example, if could get: 5 - 8, 12 - 17, 24 - 30, 43 - 49, 58 - 61, 71 - 74, 88 - 91. Same for all servers. The small ranges all randomly spread meaning that one server won't just have one adjacent neighbour but a collection of different neighbours for each of its small ranges. When a new server is added it will also get a number if ranges, and number of different neighbours which means its benefits will be distributed more evenly.

Isn't that so cool!

Consistent hashing benefits aren't just limited to scaling. They also are brilliant for availability. Let's say server 2 goes offlines. What happens is the complete opposite to what happens for when a new server is added. Each one of Server 2 segments will become the responsibility of the server who is responsible for the preceeding segment. Again, if servers are getting a fair distribution of ranges they are responsible for it means when a server fails, the burden will be evenly distributed amongst the remaining servers.

Again the point has to emphasised, the books never have to rehashed. Their hashes are consistent.

References

http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
http://www.tomkleinpeter.com/2008/03/17/programmers-toolbox-part-3-consistent-hashing/
http://michaelnielsen.org/blog/consistent-hashing/

Dublin Tech

Monday, June 20, 2011

Consistent Hashing

No comments:

Post a Comment

Google Profile

Java Code Geek