Powered By Blogger

Monday, July 25, 2011

Why ConcurrentHashMap is better than Hashtable and HashMap ???

ConcurrentHashMap is a pretty ignored class. The class offers a very robust and fast (comparatively, we all know java concurrency isn’t the fastest) method of synchronizing a Map collection.
There is no way you can compare the two, one offers synchronized methods to access a map while the other offers no synchronization whatsoever. What most of us fail to notice is that while our applications, web applications especially, work fine during the development & testing phase, they usually go tits up under heavy (or even moderately heavy) load in PRODUCTION. This is due to the fact that we expect our HashMap’s to behave a certain way but under load they usually misbehave.
Hashtable’s offer concurrent access to their entries, with a small caveat, the entire map is locked to perform any sort of operation. While this overhead is ignorable in a web application under normal load, under heavy load it can lead to delayed response times and overtaxing of your server for no good reason.
This is where ConcurrentHashMap’s step in. They offer all the features of Hashtable with a performance almost as good as a HashMap. ConcurrentHashMap’s accomplish this by a very simple mechanism. Instead of a map wide lock, the collection maintains a list of 16 locks by default, each of which is used to guard (or lock on) a single bucket of the map. This effectively means that 16 threads can modify the collection at a single time (as long as they’re all working on different buckets). Infact there is no operation performed by this collection that locks the entire map. The concurrency level of the collection, the number of threads that can modify it at the same time without blocking, can be increased. However a higher number means more overhead of maintaining this list of locks.
Retrieval operations on a ConcurrentHashMap do not block unless the entry is not found in the bucket or if the value of the entry is null. In such a case the map synchronizes on the bucket and then tries to look for the entry again just in case the entry was put or removed right after the get in synchronized mode.
Removal operations do require a bit of overhead. All removal operations require the chain of elements before and after to be cloned and joined without the removed element. Since the value of the map key is volatile (not really, the value of the inner Entry class is volatile) if a thread already traversing the bucket from which a value is removed reaches the removed element, it automatically sees a null value and knows to ignore such a value.
Traversal in a ConcurrentHashMap does not synchronize on the entire map either. Infact traversal does not synchronize at all except under one condition. The internal LinkedList implementation is aware of the changes to the underlying collection. If it detects any such changes during traversal it synchronizes itself on the bucket it is traversing and then tries to re-read the values. This always insures that while the values recieved are always fresh, there is minimalistic locking if any.
Iteration over a ConcurrentHashMap are a little different from those offered by other collections. The iterators are not fail-fast in the sense that they do not throw a ConcurrentModificationException. They also do not guarantee that once the iterator is created it will list/show all elements that are added after its creation. The iterators do however guarantee that any updates or removal of items will be reflected correctly in their behaviour. They also guarantee that no element will be returned more than once while traversal.
In conclusion, give it a try, replace some Hashtable’s in your application with ConcurrentHashMap and see how they perform under load. The two are interchangeable so it shouldn’t be hard to update your app.

1 comment: