Blogface.org: November 2008

Monday, November 24, 2008

The Atomic Nugget Design Pattern: Part II

The Atomic Nugget design pattern is a way of encapsulating some action that a client wants to perform atomically with respect to a provider class. The idea is that the client will write a "first-class function," in the form of an anonymous inner class, and then give that action to the provider, who will then internally acquire and release the necessary synchronization. In order to make this work, we will need an interface for our first-class functions:

public interface Lambda<T1, T2> {

public T2 eval(T1 t);

}

This interface is the type of first-class functions that take one argument and return one argument. The next thing we'll need is an interface that signifies the fact that my provider class is an atomic nugget, which will allow client's code to be run atomically. As I previously mentioned, and like the lock/unlock solution, this requires some forsight from the developer.

public interface AtomicNugget<T> {

public <S> S atomic(Lambda<T,S> operation);

}

This interface allows the implementer to take a client operation, in the form of a lambda from elements of 'this' type to elements of some other type, and execute that operation atomically. Here's how it might look in some kind of thread-safe stack:

public class AtomicStack<T> implements AtomicNugget<AtomicStack<T>> {

private final Deque<T> stack = new LinkedList<T>();

public synchronized T pop() {

return stack.pop();

}

public synchronized void push(T e) {

stack.push(e);

}

public synchronized boolean isEmpty() {

return stack.isEmpty();

}

public synchronized <S> S atomic(Lambda<AtomicStack<T>, S> operation) {

return operation.eval(this);

}

This particular case is pretty simple. Note how AtomicStack<T> implements AtomicNugget<AtomicStack<T>>. This will ensure that the first argument of the lambda is the stack itself. The client is free to return any type they want from their atomic operation, which we allow by parameterizing the atomic method. Every public method of the AtomicStack is synchronized, including atomic, so that when a client calls atomic, everything that is executed inside of the operation will be done in the confines of that synchronized block. Here's a client side use:

AtomicStack<Integer> stack = new AtomicStack<Integer>();

Integer got =

stack.atomic(new Lambda<AtomicStack<Integer>,Integer>() {

public Integer eval(AtomicStack<Integer> t) {

if( !t.isEmpty() ) return t.pop();

else return null;

}});

In this example, the client creates an anonymous inner class, the "first-class function," which consists of an isEmpty check and a call to pop. This new object is then passed as an argument to the atomic method of the newly created stack. The "lambda" returns the dequed item to the outer context. This is a pretty straightforward example. Here's something a little more interesting. There is a classic problem in concurrency relating to bank accounts. How can I, as a client, using two thread-shared bank accounts withdraw money from one and deposit the money to the other all in one atomic operation? Deadlock is an important concern in this sort of example. Well, we can solve it using AtomicNugget2, which takes two elements of the same type. Internally we are free to synchronize on both object according to our deadlock prevention scheme (in this example, I use the hash code as an ID to determine ordering of lock acquire). First, here is AtomicNugget2:

public interface AtomicNugget2<T> {

public <S> S atomic(Lambda2<T,T,S> operation, T withRespectTo);

}

Pretty similar to AtomicNugget. Now, here is our bank account, which implements the interface:

public class BankAccount implements AtomicNugget2<BankAccount> {

private long amount;

public synchronized void withdraw(long amt) {

amount -= amt;

}

public synchronized void deposit(long amt) {

amount += amt;

}

public <S> S atomic(Lambda2<BankAccount, BankAccount, S> operation,

BankAccount withRespectTo) {

// We acquire locks in order of hash code

if( System.identityHashCode(this) < System.identityHashCode(withRespectTo)) {

synchronized( this ) {

synchronized( withRespectTo ) {

return operation.eval(this, withRespectTo);

}

else {

synchronized( withRespectTo ) {

synchronized( this ) {

return operation.eval(this, withRespectTo);

}

}

Note how we reverse the order of the synchonized blocks depending on the hash ids of the bank accounts. However, we still pass the two bank accounts as arguments to the lambda in the same order. This is crucial for clients who need to know, for example, which is the deposit account and which is the withdrawal account. Finally, here's a client:

BankAccount b1 = //...

BankAccount b2 = // ...

b1.atomic(new Lambda2<BankAccount,BankAccount,VOID>(){

public VOID eval(BankAccount t1, BankAccount t2) {

t1.withdraw(500);

t2.deposit(500);

return VOID.instance;

}}, b2);

The atomic nugget has some problems. Notably, the provider has to be aware that the client's calls will almost certainly be reentrant. If you are using semaphores or some kind of non-reentrant locks, this could be a problem. There are others. Nonetheless, I think in many cases this is a good way to go, and if you are already programming Java in a functional style (a good, if verbose way of doing things) it will seem very natual.

The Atomic Nugget Design Pattern: Part I

I've been telling people about the Atomic Nugget design pattern for a while, but I have not yet described it in detail anywhere. Atomic Nugget is a design pattern of my own creation that allows for better encapsulation with respect to synchronization in Java and similar languages. And yes, I mostly just like it because of the name, but I do happen to think it's legitamately useful. In this post, I'll just briefly talk about why Atomic Nugget is useful. In the next post, I'll give the code.

In Java, there are simple ways to protect the integrity of your objects with respect to multi-threading if you are the class implementer; you can make all of your public methods synchronized. This will ensure that access from multiple threads to an object instance are effectively serialized. (Of course, it's not always as simple as that, but to a first approximation, this strategy works.) However, what if you are implementing a class that depends on other thread-shared classes? For example, the following worker thread references a thread-shared work queue that is internally synchronized.

class WorkerThread {
private Deque workQueue = // ...
public void run() { while(true) { if( !this.workQueue.isEmpty() ) doWork( this.workQueue.pop() ) } }
}

Please ignore the fact that this worker class has many problems. I'm trying to keep the examples short. There is one problem I would like you to focus on, however. The problem is the race condition. If another thread removes the last item from the work queue in between the time this thread calls isEmpty, and the time it calls pop, an exception will be thrown on the call to pop. This is a symptom of a more general problem. The designer of the queue has delineated certain critical sections when they wrote the methods of the queue class. However, as a client, I want to enlarge those critical sections, but in order to do so, I need access to the locks or synchronization primitives that are used internally to the queue. How do people solve this problem now?

I could try to make this work by synchronizing on workQueue. However, this is very brittle and violates encapsulation. It requires me to look inside the implementation of the work queue, and ensure that the methods I want to call are synchronized on the work queue object itself, and not some other private member object. Also, if the implementation of the work queue ever changes, to use a different synchronization strategy, my code will be broken.
The implementer can provide "lock" and "unlock" methods. This requires class implementers to think ahead to imagine how their classes might be used in the future. (My solution will require this too, as you will see.) But it also painfully requires clients to remember to call the unlock method, which in the simple case might not be to much additional work, but in reality requires clients to use a "finally" block, so that exceptions will not cause locks to be held on to indefinitely. And in general, it's just not nice to have to put your synchronization into the hands of your client.

Enter the Atomic Nugget.

Saturday, November 22, 2008

TRANSACT Submission Submitted

Why, oh why does the 'backspace' key go 'back' to the previous page in all browsers? This is a bug, not something helpful. I just lost the entire entry that I was composing. This is worse than the 'insert' key.

Last night we finally submitted the paper that I have been spending all my time on for the past 3 weeks. We submitted to TRANSACT 2009, a workshop on transactional memory. Our paper was on an optimization of STM using the static aliasing annotations that we have been working on in our group.

Honestly, the process was pretty painful. We had to do benchmarks, which is not something I am used to, and we had a couple fall through at the very last moment, because we could not explain why we were seeing the results we were seeing. I now have a lot more repsect for people who work in systems.

Anyway, tonight I am rewarding myself by going to the Girl Talk concert. I've never seen him live, but his last two albums are in super heavy rotation at home and at work. (Seriously, check the stats.) Also, Grand Buffet is opening, and they are sweet.

Monday, November 17, 2008

Vargo V. Vargomaxx

In honor of my not posting in two weeks (seriously, I have a paper deadline this Friday!) I would like to alert you to the following LOLs:

Generalized Super Mario Bros. is NP-Complete

This was a presentation from SIGBOVIK 2007, and I thought it was hilarious at the time, but never remembered to post it. (Also it wasn't online.) It's kind of nerdy, but hilarious.

Monday, November 3, 2008

Paper Poster

Okay, they finally posted my OOPSLA paper on the ACM web site. The official address is:
"Verifying Correct Usage of Atomic Blocks and Typestate"