Lately I have been trying to improve my skills on lock-free algorithms. So I have been studying the literature and the javadoc of the atomic package. But I also felt that I needed to apply them on a concrete programming exercise to actually do progress grasping the important concepts. So I was trying hard to come up with an idea, but for some reason I couldn’t. This changed during last week’s Heinz’s happy hour webinar. At some point a comment by Piotr Findeisen prompted Heinz to talk about “racing threads”, “which thread won and did them all”. All this talk triggered my imagination and I somehow visualized LightCycles from the Movie Tron racing against each other playing chicken. 🙂

This post is about how I implemented a set of program in Java that models this imaginative scenario using two Threads and how I though it through to eventually get to one which uses a lock-free algorithm. I decided to share this for two reasons:

  • One because I found it quite cool to work on this, and would have loved to read it myself! 🙂

  • Secondly I am hoping this will prove helpful to others trying to think on these topics. I had quite some trouble reasoning about this so I will try to explaine how I thought about it and reached a solution that seems to work!

NOTE: This post is documenting my personal journey to come up with a lock-free algorithm, so please be mindful. This information could be totally wrong. According to the following link, it is difficult to implement lock-free algorithms and there are cases where even peer-reviewed published implementations from experts have been found to be with problems http://www.drdobbs.com/cpp/lock-free-code-a-false-sense-of-security/210600279

The problem

We have a racetrack of three lanes and is modeled as a two dimmensional array of chars. Two threads ‘race’ on this track and leave their mark in the form of storing their char name on each array cell. The two racing threads both start on the middle lane, one at the start and the other at the end and move towards each other. When the two threads meet they should change lanes without ‘crashing’ ie they must both change lanes and go to a lane that is not occupied at exactly the meeting point.

Also it is possible for one of the threads to finish the race before the other has stepped onto the first cell, resulting in a result where each solely and completely occupies one lane.

Another twist that I have added is that both threads when the meed they will both compete to get to the upper lane.

In my programs I have named the forward racer ‘f’, the backwards racer ‘b’ and for the racetrack to be visible I prefil it with ‘.’

So here are two examples of ‘legal’ outcomes:

and here are two examples of ‘crashes’:
on this one we see that the threads have written over each
other:

and on this one they have not changes lanes and overwrote
each other’s markings:

The unsynchronized version

This is the unsynchronized version of the program. With a small lenght of the racetrack, this is pretty difficult to show to be broken, because the thread that starts will move very fast, mark the entire middle lane before the other even starts ie winning the race. This is similar to the condition we were observing on Heinz’s code the first thread would win and do everything.

One other interesting thing to note, is that on my hardware when the lenght is large enough, and the do meet, I was unable to observer them switching lanes properly an they would always crash.

private static boolean upperLaneClaim = false;

private final char name;

private int position = 0;
private int lane = 1;

private Racer opponent;

...

@Override
    public void run() {
        int direction = position == 0
                ? 1
                : -1;

        for (; position <= TRACK_LENGTH - 1 && position >= 0; position += direction) {
            boolean haveMet = direction > 0
                    ? opponent.position < position
                    : position < opponent.position;

            if (haveMet) {
                if (upperLaneClaim == false) {
                    upperLaneClaim = true;
                    lane--;
                    opponent.lane++;
                }
            }
            raceTrack[lane][position] = name;
        }
    }

The locking version

Before moving on to tackle the lock free version I decide to also work a bit on a locking version to help me better understand the problem.

My intention was to take all read and write operation that the threads do and put them inside a critical section. So I reworked the for loop to an explicit form and wrapped each iteration inside a ReentrantLock lock and unlock section. I am sure this is not the most optimal and stylish way to achieve a solution, but this way I was pretty confident that I was actually protecting everthing I needed.

private static boolean upperLaneClaim = false;

private static Lock lock = new ReentrantLock();

private final char[][] raceTrack;
private final char name;

private int position = 0;
private int lane = 1;

private Racer opponent;
...

@Override
    public void run() {
        int direction = position == 0
                        ? 1
                        : -1;

        for (;;) {
            lock.lock();
            try {
                if (position < 0 || position > TRACK_LENGTH - 1) {
                    break;
                }
                boolean haveMet = direction > 0
                        ? opponent.position < position
                        : position < opponent.position;

                if (haveMet) {
                    if (upperLaneClaim == false) {
                        upperLaneClaim = true;
                        lane--;
                        opponent.lane++;
                    }
                }
                raceTrack[lane][position] = name;
                position += direction;
            } finally {
                lock.unlock();
            }
        }
    }

First attemps at lock-free

If we look at the code I had written for the locked version we can see two cases of compare and set. I knew that for lock-free algorithms this is a good place to start. The most obvious one is

if (upperLaneClaim == false) {
   upperLaneClaim = true;
   ...
}

so I changed this to

if (upperLaneClaimed.compareAndSet(false, true)) {

The expectation was that by doing this, the racers wouldn’t be able to both occupy the upper lane. This worked, but I was left with the following problem: While one thread was trying to update the lane information for both, the other thread continued racing and overshooting the meeting point. I played a lot with this, trying various changes. Like spinning the other thread while the first one was doing it’s updates but nothing worked. The truth is that I was frustrated and I knew that I was missing the point, but continued due to grit. So I decided to call it a day and move on.

Revelation #1

A big issue I had when working on this was that I couldn’t find some easy examples to see. I could either simple out of context examples or very complex examples where the basic points were lost in the overall complexity of them.

Then I happened to come across this presentation https://www.cs.cmu.edu/~410-s05/lectures/L31_LockFree.pdf. In this presentation there is an example where there is a simple example of push and pop on a stack. At last there was an example in a context that I understood(a stack).

This led to my first revelation:

I need to make the threads compete to operate on the same thing at the same time for this to work.

Now that I am writing this it feels very weird, because it looks so obvious now. Yet I couldn’t see it before.

Ok, so with this revelation, how did that apply to my case? What is it that my racers are really competing on? Well, they are competing (a) to write to the racetrack, (b) to the claim to write to the upper lane. In case of (a) the unit of competition is each cell in the two dimensional array.

This led me to think why not use an atomic array instead of a simple char array and I used AtomicIntegerArray.

The lock free with atomic racetrack version

The above revelation led to the following version:

private final AtomicIntegerArray[] raceTrack;
private final AtomicBoolean upperLaneClaim;
private final char name;

private int position = 0;

...

    @Override
    public void run() {
        int direction = position == 0
                ? 1
                : -1;
        int lane = 1;

        for (; position <= TRACK_LENGTH - 1 && position >= 0; position += direction) {
            if (!raceTrack[lane].compareAndSet(position, '.', name)) {
                if (upperLaneClaim.compareAndSet(false, true)) {
                    lane = 0;
                    raceTrack[lane].set(position, name);
                } else {
                    lane = 2;
                    raceTrack[lane].set(position, name);
                }

            }
        }
    }

This works by trying to atomicaly update the marking on the racetrack to the thread’s name and expecting it to be on it’s default value of ‘.’ on the middle lane. When this fails we know that the other thread has already beat us to it and then we race to claim the right to write on the upper lane. If our claim wins it means that we can now write to the upper lane so we set our local variable to point to lane = 0. Otherwise we have lost the claim, so now we need to write to the bottom lane, hence we assign lane=2. On the susequent iterations we will be writting to our respective lanes that are empty so raceTrack[lane].compareAndSet(position, '.', name) will always return true and we will never enter the if again finishing our run.

The atomic final version

When I finished we the above, I was elated for making it work in a lock free way but there was still something that I did not like. I had changed the data type of the racetrack when I would have wanted it still be a two dimensional char array. I also knew that I did not really need to write to the top and bottom lane atomically so they were superfluous. Furthermore I needed to compareAndSet with the current array content of ‘.’ which is actually only for decorative purposes. So I wrote the following version:

private final char name;
private int startPosition = 0;

private final char[][] raceTrack;

private final AtomicIntegerArray centerLaneClaims;
private final AtomicBoolean upperLaneClaim;

...

    @Override
    public void run() {
        int direction = startPosition == 0
                ? 1
                : -1;

        boolean claimsComplete = false;
        int lane = 1;

        for (int position = startPosition;
             position <= TRACK_LENGTH - 1 && position >= 0;
             position += direction) {
            if (!claimsComplete && centerLaneClaims.compareAndSet(position, 0, 1)) {
                if (upperLaneClaim.compareAndSet(false, true)) {
                    lane = 0;
                } else {
                    lane = 2;
                }
                claimsComplete = true;
            }
            raceTrack[lane][position] = name;
        }
    }

In this I have kept the racetrack as char[][], but also introduced the AtomicIntegerArray centerLaneClaims which keeps the claims to write on the center lane. So racers in order to write to the each cell on the center lane of the racetrack they need to compete on it’s repspective entry on the atomic array. For the uperlane claim it is the same as the previously shown version.

The other addition is the introduction of the claimsComplete local variable. This will be set to true when a racer has either won or lost all the claims, which intents to accelarate the racers by not having to use atomic operations after meeting. I have struggled a bit to reason whether this was correct to do. Inside a single Thread the happens-before order obeys the program order (first bullet at Memory Consistency Properties)

So I came to the conclusion that since all the operation are either on local variables or on atomic types this must be correct.

Revelation #2

The other thing that helped me in my thought process was to try to think only from one point of view. I am not sure whether it would be applicable and helpful on other algorithms but it helped my a lot in this case.

Let me elaborate.. On my first attempt I tried to think about the two threads at the same time. So my pattern of thinking was similar to the folowing:

“What am I trying to achieve, when the other thread tries to achieve that and what will happen when I find this and the other thread finds that and what should I do when the other thread does that?”

I did make progress when I shifted my perspective to thinking in a pattern similar to this:

“What am I trying to achieve, what will happen if I find this, and what should I do if I find it different?”

Correctness

I can not sure whether the above final version is correct but in this case, I actually feel pretty good about it. I would love it if someone found a bug and reported it to me since that would mean I have something more to learn!

Conclusion

BTW You can find the code on github here!

This has been a lot of fun to work on and write about. I hope that you also found it interesting! Also I do hope, if you have been struggling with these concepts, you will find my reasonings helpful in your quest to reach your own!

Thanks for reading!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s