One of the topics I covered in my scalability piece was state models. In a nutshell, you need a means to share state between the nodes of your scaled-out application, whether through a database or a clustered memory model, a la Terracotta. What I failed to mention is that a clear, well-thought-out locking strategy also needs to be put in place.
In a concurrent, multi-threaded world, using locks to access and modify shared data is common practice. But what happens when multiple instances of your application share state? Most locks are not effective across JVMs. Neither are the newer Java concurrency classes such as Semaphore, which are more elegant and configurable constructs than the old synchronized keyword. Terracotta provides its own locking constructs for accessing shared data. With databases, you have to ensure locks are implemented correctly to achieve transactional integrity. Testing this is also tricky and should be handled carefully. Either way, your locking model deserves a close look.
Our locking model, fortunately, never came under the microscope when we scaled out our application. Very early in the development cycle, we hit upon the realization that despite heavy concurrent access to our state store, most accesses would be reads, not writes. This, in turn, led us to write a simple “Optimistic Concurrency Model” as our locking strategy. Essentially, every read from our state store would return a cloned copy of the data, with no locks used. The thread was then free to modify the cloned data without worrying about affecting the flow of any other concurrent thread. When it attempted to write its new state back to the state store, the following rules would be applied:
- It would acquire an exclusive lock to write back into the collection. No other reads or writes to that specific record could happen concurrently.
- If the data object had been updated by another thread in the interim between this one reading it and attempting to write, the write would fail. The thread would then get a cloned copy of the updated state and process the data again, with a high likelihood that a new code path is executed (unless the updated state has no effect on this thread’s execution).
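The rules above can be sketched in a few lines of Java. This is a hypothetical, simplified reconstruction, not our actual code: records carry a version number, reads hand out immutable snapshots without locking, and the write-back path takes a single exclusive lock and rejects any write based on a stale read.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of an optimistic-concurrency state store.
public class OptimisticStore {

    // Immutable snapshot handed out to readers; the version records
    // which revision of the data the snapshot was taken from.
    public record Snapshot(long version, String data) {}

    private final Map<String, Snapshot> records = new ConcurrentHashMap<>();

    public void put(String key, String data) {
        records.put(key, new Snapshot(1, data));
    }

    // Reads take no lock; snapshots are immutable, so callers can
    // work on what is effectively a cloned copy of the state.
    public Snapshot read(String key) {
        return records.get(key);
    }

    // Rule 1: writes take an exclusive lock while writing back.
    // Rule 2: if the record changed since readCopy was taken, fail;
    // the caller must re-read the updated state and process again.
    public boolean tryWrite(String key, Snapshot readCopy, String newData) {
        synchronized (this) {
            Snapshot current = records.get(key);
            if (current != null && current.version() != readCopy.version()) {
                return false; // stale read: reject the write
            }
            records.put(key, new Snapshot(readCopy.version() + 1, newData));
            return true;
        }
    }
}
```

A real implementation would lock per record rather than per store, but the version check is the heart of the strategy: conflicts are detected at write time instead of being prevented at read time.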
This improved performance tremendously in our single-node, multi-threaded scenario, and when we scaled out to multiple nodes, no new code was needed to re-engineer the locking strategy.
Now, why did this strike me after my talk and not before? Because I came across this in the latest edition of The Java Specialists’ Newsletter:
“Since Java 5, the language includes support for atomics. Instead of synchronizing access to our fields, we can use atomic references or atomic primitives. Atomics use the Compare-And-Swap approach, supported by hardware (the CMPXCHG instruction on Intel). For example, if you want to do a ++i operation with AtomicInteger, you would call the AtomicInteger.addAndGet(int delta) method. This would use an optimistic algorithm that assumes we will not have a conflict. If we do have a conflict, we simply try again. The addAndGet() method does something like this:
- get the current value and store it in a local variable current
- store the new value (current + delta) in a local variable next
- call compareAndSet with the current value and the next value
- inside the compareAndSet, if the current value matches the value in the AtomicInteger, then return true; otherwise false
- if compareAndSet returns true, we return next; otherwise start from 1.
We can thus have thread safe code without explicit locking.”
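The retry loop the newsletter describes can be written out by hand against compareAndSet. This is an illustrative equivalent of what addAndGet does internally, not the JDK’s actual implementation:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    // A hand-rolled equivalent of AtomicInteger.addAndGet(delta),
    // spelled out as the optimistic compare-and-swap retry loop.
    public static int addAndGet(AtomicInteger atomic, int delta) {
        for (;;) {
            int current = atomic.get();   // read the current value
            int next = current + delta;   // compute the new value
            // CAS succeeds only if nobody changed the value in between
            if (atomic.compareAndSet(current, next)) {
                return next;
            }
            // conflict: another thread won the race; simply try again
        }
    }
}
```

The shape is the same as our state store’s write-back rule: read without locking, attempt the update, and retry from the top if a concurrent modification is detected.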
You feel vindicated when the language platform adopts a similar approach to solve a similar problem.