Sunday, October 17, 2010

BigMemory: Followup Q and A

I got quite a few responses to my post on BigMemory in the Terracotta Server. It seems people are quite confused about what it actually is.

Here are answers to a few of the questions I received:

1. Why can't they (Terracotta) put garbage collector on another cpu core and gain performance?

I think there is a misunderstanding about the cost of garbage collection. The Full GC pause, during which all application threads are stopped, is what the GC problem in Java is all about. It is tolerable when your Heap is 1-2 GB, but beyond that you start seeing GC pauses of 4, 5, even 8 seconds. Moving the collector to another core does not make that pause go away. Besides, if you don't run ParallelGC, the collector will use only one core anyway. You DO want your garbage collector using all the cores, so that collections complete faster and pauses are shorter.
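If you want a rough idea of how much time your own JVM spends collecting, the standard java.lang.management API exposes per-collector counters. The following is a minimal sketch; the class and method names (GcTimeSketch, totalGcMillis) are made up for illustration. The reported time is accumulated collection time per collector, which for concurrent collectors is not the same thing as stop-the-world pause time, so treat it as a first approximation only.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, for illustration only.
class GcTimeSketch {

    // Sums the accumulated collection time (in ms) reported by every collector in this JVM.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report a time
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Print one line per collector, e.g. the young and old generation collectors.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.println("total: " + totalGcMillis() + " ms");
    }
}
```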

2. (In reference to the question above) Then put it on another thread, and how about pausing one thread at a time?

Again, AFAIK this is not possible with the Sun/Oracle JVM. Full GC pauses are a necessary evil of the GC algorithms it uses. Even if it were possible, it would not solve the problem of unpredictability.

3. I can't believe there are no GC pauses ... or you guys might have made memory management solution like an OS in java.

The idea of direct memory allocation in Java is no big secret. There is an -XX:MaxDirectMemorySize flag that tells the JVM how much direct memory it may allocate. The value add of Terracotta is to use this direct memory space in a way that is fast and does not fragment.
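For readers who have never touched direct memory, here is a minimal sketch of what the plain JDK offers through java.nio.ByteBuffer. This is not how BigMemory is implemented; the class and method names (DirectBufferSketch, roundTrip) are made up for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical example class, for illustration only.
class DirectBufferSketch {

    // Writes a long into freshly allocated direct (off-heap) memory and reads it back.
    static long roundTrip(long value) {
        // allocateDirect reserves native memory; only the small ByteBuffer
        // wrapper object lives on the garbage-collected Heap.
        ByteBuffer buffer = ByteBuffer.allocateDirect(Long.BYTES);
        buffer.putLong(0, value);
        return buffer.getLong(0);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(1024 * 1024); // 1 MB outside the Heap
        System.out.println("direct: " + buffer.isDirect());
        System.out.println("round trip: " + roundTrip(42L));
    }
}
```

Running it with something like `java -XX:MaxDirectMemorySize=64m DirectBufferSketch` caps the total the JVM will hand out this way; asking for more than the cap fails with an OutOfMemoryError.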

4. Using direct memory allocated by the JVM is useless because it is so much slower than the Heap.

Access to direct memory is NOT slower than Heap. Two things contribute to the perceived slowness of direct memory: serializing and deserializing data to and from direct memory, and allocating and cleaning up direct memory buffers. At Terracotta we solved the direct memory allocation and cleanup problem. On the Terracotta Server we don't pay the serialization/deserialization cost. On Enterprise Ehcache (unclustered) we do pay a serialization/deserialization cost, but compare this CPU cost to having to deal with Full GC pauses on the Heap. The tradeoff is well worth it.

Besides, BigMemory uses the Heap as part of a tiered storage strategy: Heap to OffHeap to Disk. It's an age-old principle in computer science (think Virtual Memory). We avoid the serialization/deserialization cost for frequently used objects by keeping those on the Heap, keep a big part of your cache OffHeap to avoid long Full GCs, and let the rest spill over to disk.

In return for the additional CPU cost, you get predictable latency and speed with all the memory your Java process desires. Find an app where you do see Full GC pauses and check out the beta to see for yourself.
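To make the serialization/deserialization cost concrete, here is a minimal sketch that stores a value OffHeap using standard Java serialization and a direct ByteBuffer. It illustrates the general technique only and is not Terracotta's or Ehcache's actual code; OffHeapValueSketch, store and load are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical example class, for illustration only.
class OffHeapValueSketch {

    // Serializes the value (the CPU cost) and copies the bytes into direct memory.
    static ByteBuffer store(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] serialized = bytes.toByteArray();
        ByteBuffer offHeap = ByteBuffer.allocateDirect(serialized.length);
        offHeap.put(serialized);
        offHeap.flip(); // position back to 0, limit at the end of the data
        return offHeap;
    }

    // Copies the bytes back onto the Heap and deserializes them (the CPU cost again).
    static Object load(ByteBuffer offHeap) {
        byte[] serialized = new byte[offHeap.remaining()];
        // duplicate() shares the memory but has its own position, so the
        // stored buffer can be read any number of times.
        offHeap.duplicate().get(serialized);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(serialized))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ByteBuffer stored = store("cached value");
        System.out.println(load(stored));
    }
}
```

While the value sits in the direct buffer the garbage collector has nothing to trace for it, which is the whole point; the price is the two copies and the serialization work on every store and load.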
