Archive for July, 2008

h1

TestRR

July 30, 2008

As mentioned in my previous post, I have open-sourced a very generic and lightweight version of the robustness framework in Aloha. The name, very unimaginatively is TestRR, and is available on google code.

For the moment, the features and documentation available on the download are quite basic. The framework and basic functionality are there, and more will be added if there is any demand at all for it.

I hope people download it, try it and give me some feedback to improve it!

Advertisements
h1

Optimistic Concurrency Control

July 17, 2008

There are several important lessons that can be gleaned from Aloha and easily applied to other projects. One major lesson is the integration of robustness and performance builds into the continuous integration environment. This is of course the topic I am presenting on at Agile2008 along with Robbie and Fab (As a side-note, I am currently working on creating an extremely simple and lightweight robustness and performance framework based on Aloha’s model that can easily be integrated into any Java project with a minimum of fuss. I will be putting out what I have in open-source land, hopefully early next week. I also currently don’t have a name for this framework – I am calling it RPFramework for now, but would like a better, cooler name without the word framework in it. Any suggestions would be welcome!)

But what I want to focus on right now, as the title would suggest, is the optimistic concurrency control model implemented in Aloha. It isn’t the easiest or most intuitive mechanism to implement in any multi-threaded application. Most applications tend to go the pessimistic route with some locking mechanism, typically using semaphores or other language constructs such as the synchronized keyword. From my experiences in trying to describe how the optimistic model works to others (typically colleagues joining the Aloha team), developers, even highly experienced and competent ones, have loads of trouble really grasping the concepts behind it. Their first few dabbles at working with the implementation are inherently wrong as they struggle to cope with the intricacies behind the model.

One (un-scientific) way of gauging the effectiveness of the model is to look at other applications that use it. Among databases, Oracle is the only major player which uses such a model. All the others use a pessimistic locking mechanism – it is however hard to pinpoint this as the primary reason (or even one of them) why Oracle outperforms most other RDBMS. Java itself uses this modelin its implementation of the increment() method in AtomicInteger. And just last night, I read a very interesting article that delved into the depths of Transaction Memory (TM).

Transaction Memory attempts to make life easier for programmers who work on concurrent applications. It gives the developer two of the ACID properties that databases give you – atomicity and isolation. Therefore, you get to write code, never worrying about concurrency and synchronicity. TM allows you to work as if you were writing a single-threaded application, not worrying about locks et al. Now, how does TM work under the covers? There are two main flavours – STM and HTM (software and hardware versions). The hardware versions, as always, are better performing but much harder to implement and the more complex functionality is currently implemented in software. So what does TM have to do with the subject of this post? – Apparently, most TM systems are implemented using an optimistic model, using nearly all of the same mechanisms implemented in Aloha. There are two notable differences: its transparency to the developer and the retry mechanism. Let me first describe how the optimistic model itself works and then I will go into a bit more detail about these differences.

Instead of going into the textbook details of an optimistic concurrency model, let me describe the optimistic model as implemented in Aloha and why we chose this path. Since there is no locking in an optimistic model, any thread looking to read shared state can do so instantaneously. Writing is more complicated and can lead to delays and retries – but if the majority of data access is for reading (which is true in most applications), you really have reason to be optimistic. When you read shared state, what you really get is a cloned copy of the state. Any mutations on this state is local; it affects no one else. When you try to write the changes back into the state store, the following happens:

  • If no other thread has saved a newer version into the collection since the one you read, your changes are saved and no further action is required.
  • If some other thread(s) has saved newer changes, all your changes are rolled back and you get to try your entire “transaction” again.
  • If a ConcurrentUpdateBlock has rolled back 10 times, it isn’t retried again as it is extremely unlikely it will ever succeed.

The two main differences between optimistic concurrency control as implemented in Aloha and in a TM are:

  • In TM, the rollback and re-try mechanisms are done transparently. In Aloha, this is done explicitly by writing code which accesses and changes shared state in ConcurrentUpdateBlocks. Each ConcurrentUpdateBlock is then invoked through the ConcurrentUpdateManager which handles the retry mechanism for you. Therefore, the developer has to be very aware of shared state accesses and has to understand how the model works while writing code.
  • The other big difference is the retry mechanism. Aloha retries each ConcurrentUpdateBlock 10 times. The TM system that was described in the article I read implemented a more complex system using priority levels which increased with each failed commit. This is obviously better to guarantee fairness but we didn’t see the need to implement it until an issue became apparent.

And finally, Aloha’s state and concurrency models are easily extensible such that applications built on top of Aloha can use them with next to no effort.

h1

3 more months to Anand-Kramnik

July 8, 2008

I am still counting down the days to the big match coming up in October. Somewhat to my surprise, both players are choosing to stay reasonably active leading up to the match. I had assumed that they would both shut down tournament play post-Linares and start up their preparations. Instead, Kramnik’s just finished playing Dortmund and has apparently agreed to defend his title at the Tal Memorial in September. Vishy is playing the Mainz Classic (which he always seems to win) but more importantly, is playing the Grand Slam Final in September.

I guess the strategy is 3-fold:

  • Shake off the rust and get some real games under your belt.
  • Don’t let any opening secrets out of your bag. (I guess a side-effect of this will be that they won’t play any spectacular novelties and will probably not contend in these tournaments)
  • Play openings you wouldn’t normally play, just to make your opponent waste time conducting his due diligence on it. For example, Kramnik unleashed the Grunfeld at Dortmund. By the same token, I fully expect Anand to open with d4 and/or Nf3 at the Grand Slam Final – in fact, he might very well have to do that sooner or later in the match.

In addition, Kramnik’s agreed to lead the Russian team at the Chess Olympiad two weeks after the end of the match. Vishy hasn’t yet made a similar commitment to the Indian team and I don’t think he will either – the last Olympiad, where he rushed at the end of the M-Tel to join the team, was an absolute disaster for him.

And finally, I wonder what Kramnik’s ranking will be when the match starts. The next ratings are due to be published on October 1st and Ivanchuk, Carlsen, Morozevich and Topalov are all ahead of him in the live ratings.

h1

Cedric’s Coding Challenge

July 3, 2008

I thought I would take an attempt at the coding challenge presented by Cedric. Most of the solutions presented in the comments to the post seemed to involve brute force using regex manipulations, with people complaining that the implementations were quite slow. I thought I would try a different implementation, a highly customised one for the specific problem stated, but one which runs in 734ms on my not-so-fast development machine.

One interesting sidenote to add. I was never good at combinatorics and I am probably quite wrong at this, but I figure that the number of combinations I have to worry about as the number of digits increase probably increases quite alarmingly. When dealing with 2 digit numbers, I had only C(2,2) = 1 combination to worry about. With 3 digit numbers, C(3,2) = 3 and with 4 digit numbers, C(4,2) = 6. So the next increase would give me C(5,2) = 10 different variations to worry about. This isn’t critical for this particular challenge, but for a real-world problem, I would have to generalise the solution!

public class CedricCodingChallenge {
  public static void main(String[] args) {
    CedricCodingChallenge codingChallenge =
      new CedricCodingChallenge();

    long startTime = System.currentTimeMillis();
    List<String> list = codingChallenge.populateList(1, 10000);
    long endTime = System.currentTimeMillis();

    System.out.println(Arrays.toString(list.toArray()));
    System.out.println(String.format(
      "Size of list: %d", list.size()));
    System.out.println(String.format("Biggest jump: %d",
      codingChallenge.getBiggestJump(list, 10000)));
    System.out.println(String.format(
      "Total time: %d", endTime - startTime));
  }

  private List<String> populateList(int start, int end) {
    List<String> whiteList = new ArrayList<String>();
    for (int i=start; i<end; i++)
      whiteList.add(String.format("%d", i));

    for (int i=0; i<=9; i++)
      removeDuplicatesFromWhiteList(whiteList, i);

    return whiteList;
  }

  private void removeDuplicatesFromWhiteList(
    List<String> list, int digitToRemove) {

    // Remove 2 digit numbers
    if (digitToRemove > 0) {
      String numberToRemove =
        String.format("%s%s", digitToRemove, digitToRemove);
      list.remove(numberToRemove);
    }

    for (int i=0; i<=9; i++) {
      if (i > 0) {
        // Remove 3 digit numbers where digit is the last 2
        String numberToRemove = String.format(
          "%s%s%s", i, digitToRemove, digitToRemove);
        list.remove(numberToRemove);
      }

      if (digitToRemove > 0) {
        // Remove 3 digit numbers where digit is the first 2
        String numberToRemove = String.format(
          "%s%s%s", digitToRemove, digitToRemove, i);
        list.remove(numberToRemove);

        // Remove 3 digit numbers where digit is the first and the last
        numberToRemove = String.format(
          "%s%s%s", digitToRemove, i, digitToRemove);
        list.remove(numberToRemove);
      }
    }

    for (int i=0; i<=9; i++)
      for (int j=0; j<=9; j++) {
        if (i > 0) {
          // Remove 4 digit numbers where digit is the last 2
          String numberToRemove = String.format(
            "%s%s%s%s", i, j, digitToRemove, digitToRemove);
          list.remove(numberToRemove);

          // Remove 4 digit numbers where digit is the middle 2
          numberToRemove = String.format(
            "%s%s%s%s", i, digitToRemove, digitToRemove, j);
          list.remove(numberToRemove);

          // Remove 4 digit numbers where digit is 2nd and 4th
          numberToRemove = String.format(
            "%s%s%s%s", i, digitToRemove, j, digitToRemove);
          list.remove(numberToRemove);
        }

        if (digitToRemove > 0) {
          // Remove 4 digit numbers where digit is the first 2
          String numberToRemove = String.format(
            "%s%s%s%s", digitToRemove, digitToRemove, i, j);
          list.remove(numberToRemove);

          // Remove 4 digit numbers where digit is 1st and 3rd
          numberToRemove = String.format(
            "%s%s%s%s", digitToRemove, i, digitToRemove, j);
          list.remove(numberToRemove);

          // Remove 4 digit numbers where digit is 1st and 4th
          numberToRemove = String.format(
            "%s%s%s%s", digitToRemove, i, j, digitToRemove);
          list.remove(numberToRemove);
        }
      }
  }

  private int getBiggestJump(List<String> list, int max) {
    int biggestJump = 1;
    int previous = 0, next = 0;
    ListIterator<String> iterator = list.listIterator();
    while (iterator.hasNext() || previous < max) {
      if (iterator.hasNext())
        next = Integer.parseInt(iterator.next());
      else
        next = max;
      int difference = next - previous;
      biggestJump =
        biggestJump >= difference? biggestJump : difference;
      previous = next;
    }
    return biggestJump;
  }
}