DirectMemory, in just a few months, has undergone three complete rewrites. Why? At first I wanted it to be powerful: I had a vision in which it would become a new end-to-end solution for all cache-related needs (from heap to off-heap to disk to database). In the end, though, it was a bit too much for a single person and, most of all, it was simply not needed, as there were already wonderful and proven solutions for all of those problems.
Except, of course, for the off-heap part, which (uh-oh!) was the original inspiration. Of course, BigMemory from Terracotta is already here, but it’s a paid solution and not everyone can afford it. There’s also memcached, which can be used as well and is pretty pervasive (just think about it being the cache layer in GAE), but it is written in C and has to be installed and managed. It is well known for its performance but, being an external daemon, it imposes some network overhead on your applications. Now, while the first two rewrites tried to address the (self-induced) complexity problem, the last one concentrated on off-heap storage and on being a simpler, embedded alternative to memcached for JVM programmers. The good thing is that, having achieved a low memory footprint even for large quantities of large objects and, honestly, quite good performance even compared to standard heap caches, I think it is now a viable alternative, in some cases, to ehcache, JCS and (yeah, someone is still using it!) OSCache.
Now, what’s in the box?
DirectMemory, for the sake of simplicity, exposes one static Cache facade for the lazy programmer in need of a way to temporarily store large quantities of (possibly large) objects. The facade exposes the expected store, retrieve, free and update methods, using strings as keys, objects as payloads and an optional expiresIn value. The lazy programmer doesn’t even need to implement the Serializable interface in his objects, because they are serialized by the wonderful Protostuff library, which simply doesn’t require it and is several times faster than standard Java serialization. DirectMemory starts a separate thread that tries to collect expired entries and, should it be needed, the least frequently used ones as well. You can also find handy dump and Monitor.dump methods that dump usage statistics about buffers, hits and performance, and a clean method that simply resets the buffers so you can start from scratch with all the off-heap allocated memory free to use again. Keep in mind that memory is never released to the operating system: it can only be marked as free and reused within the cache. The store, retrieve and free methods are complemented by storeByteArray, retrieveByteArray, etc., in case you want to skip serialization (because you want to do it on your own, or you need to store binary data loaded from disk, the network or wherever).
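To make the expiresIn behaviour a bit more concrete, here is a minimal sketch of what "store with an optional expiry, then collect expired entries" could look like. It is not DirectMemory's Cache class: ExpiringCacheSketch, its method signatures and the explicit now parameter are all hypothetical, payloads are kept on the heap for brevity, and I am assuming that a non-positive expiresIn means "never expires".

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration of the facade's semantics; NOT DirectMemory's Cache. */
class ExpiringCacheSketch {

    /** A stored payload plus the absolute time at which it expires. */
    private static final class Entry {
        final byte[] payload;
        final long expiresAt; // millis; Long.MAX_VALUE means "never"

        Entry(byte[] payload, long expiresAt) {
            this.payload = payload;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    /** Stores a payload; assumption: expiresIn <= 0 means it never expires. */
    void store(String key, byte[] payload, long now, long expiresIn) {
        long expiresAt = expiresIn <= 0 ? Long.MAX_VALUE : now + expiresIn;
        entries.put(key, new Entry(payload, expiresAt));
    }

    /** Returns the payload, or null if the key is missing or already expired. */
    byte[] retrieve(String key, long now) {
        Entry e = entries.get(key);
        if (e == null || e.expiresAt <= now) {
            return null;
        }
        return e.payload;
    }

    /** Removes the entry explicitly. */
    void free(String key) {
        entries.remove(key);
    }

    /** What a background collector thread could call periodically. */
    int collectExpired(long now) {
        int removed = 0;
        for (Iterator<Entry> it = entries.values().iterator(); it.hasNext();) {
            if (it.next().expiresAt <= now) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return entries.size();
    }
}
```

Passing the current time in explicitly keeps the sketch deterministic; a real cache would more likely read System.currentTimeMillis() itself and run the collection from its own thread.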
While the lazy programmer above could find this enough, I would also like to expose some of the internals of DirectMemory. The Cache facade uses Guava Collections to keep key references and relies on the memory allocation functionality exposed by OffHeapMemoryBuffer, which is basically a simple malloc() implementation for the JVM: it uses direct ByteBuffers only for allocating and writing to the memory itself, while keeping index data in a collection of Pointer objects. The strategy is simple and effective: at startup DirectMemory allocates one or more large (up to 2 GB) direct buffers and, for each one, puts a new Pointer referring to it (start=0, size=capacity, free=true) in a list of Pointers. Every time a new value is stored, the first free Pointer is “sliced” and the new Pointer is stored into the list. When the value gets removed, the Pointer is simply set as “free” again, ready to be reused or collected by DirectMemory’s garbage collection methods. There’s little room for concurrency problems because Pointers are almost never removed from the list, just added. The MemoryManager facade is just a small layer that enables transparent interaction with more than one OffHeapMemoryBuffer object, working around the 2 GB limit for direct memory buffers.
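The slicing strategy described above can be sketched in a few lines. This is only an illustration under my own assumptions, not the actual OffHeapMemoryBuffer code: the class name OffHeapBufferSketch, the Pointer fields and the coarse synchronized locking are hypothetical, and the sketch picks the first free Pointer that is large enough for the payload.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of pointer slicing; NOT DirectMemory's OffHeapMemoryBuffer. */
class OffHeapBufferSketch {

    /** Index entry kept on the heap; the payload itself lives off-heap. */
    static final class Pointer {
        int start;
        int size;
        boolean free;

        Pointer(int start, int size, boolean free) {
            this.start = start;
            this.size = size;
            this.free = free;
        }
    }

    private final ByteBuffer buffer;
    private final List<Pointer> pointers = new ArrayList<>();

    OffHeapBufferSketch(int capacity) {
        // One large direct allocation up front, never repeated.
        buffer = ByteBuffer.allocateDirect(capacity);
        // A single free Pointer initially covers the whole buffer.
        pointers.add(new Pointer(0, capacity, true));
    }

    /** Stores the payload in the first free Pointer that fits; null if none does. */
    synchronized Pointer store(byte[] payload) {
        Pointer target = null;
        for (Pointer p : pointers) {
            if (p.free && p.size >= payload.length) {
                target = p;
                break;
            }
        }
        if (target == null) {
            return null;
        }
        Pointer used;
        if (target.size == payload.length) {
            // Exact fit: reuse the Pointer as it is.
            target.free = false;
            used = target;
        } else {
            // "Slice": the new Pointer takes the head, the free one keeps the tail.
            used = new Pointer(target.start, payload.length, false);
            target.start += payload.length;
            target.size -= payload.length;
            pointers.add(used);
        }
        ByteBuffer view = buffer.duplicate(); // independent position, same memory
        view.position(used.start);
        view.put(payload);
        return used;
    }

    /** Copies the bytes referenced by the Pointer back onto the heap. */
    synchronized byte[] retrieve(Pointer p) {
        byte[] out = new byte[p.size];
        ByteBuffer view = buffer.duplicate();
        view.position(p.start);
        view.get(out);
        return out;
    }

    /** Marks the Pointer as free again; the off-heap memory is never released. */
    synchronized void free(Pointer p) {
        p.free = true;
    }

    /** Total bytes currently referenced by non-free Pointers. */
    synchronized int used() {
        int total = 0;
        for (Pointer p : pointers) {
            if (!p.free) {
                total += p.size;
            }
        }
        return total;
    }
}
```

Note that the sketch never merges adjacent free Pointers, so it fragments over time; a real implementation would need some compaction or merging step on top of this.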
This approach makes using off-heap memory easy and fast and works around ByteBuffer limitations: as shown in K.D. Gregory’s great ByteBuffers and Non-Heap Memory article, making frequent calls to ByteBuffer.allocateDirect can have an impact on heap usage and, should contiguous memory not be available, allocation can take several seconds as well.
DirectMemory is a young project, but it is not far from its first stable release, and its goal is to become a useful tool in your programmer’s toolbox. Trying DirectMemory is easy if you use Maven (and I think everyone should!). Your feedback is encouraged and appreciated.
- The DirectMemory Cache project on github
- DirectMemory Cache, Pointer and OffHeapMemoryBuffer objects
- ByteBuffers and Non-Heap Memory (by KDGregory)
- DirectMemory Benchmark: Heap vs Off-heap
- Heap vs off-heap micro benchmark (DirectMemory wiki)
- Maximize Memory Use with BigMemory (Terracotta)
- Class ByteBuffer (java 1.4.2)