James Michael Hare

...hare-brained ideas from the realm of software development...
posts - 137 , comments - 1106 , trackbacks - 0

My Links

News

Welcome to my blog! I'm a Sr. Software Development Engineer in Seattle, WA. I've been doing C++/C#/Java development for over 18 years, but have definitely learned that there is always more to learn!

All thoughts and opinions expressed in my blog and my comments are my own and do not represent the thoughts of my employer.

Blogs I Read

MCC Logo MVP Logo

Follow BlkRabbitCoder on Twitter

Tag Cloud

Archives

Post Categories

C#/.NET Little Wonders: Interlocked CompareExchange()

Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here.

Two posts ago, I discussed the Interlocked Add(), Increment(), and Decrement() methods (here) for adding and subtracting values in a thread-safe, lightweight manner.  Then, last post I talked about the Interlocked Read() and Exchange() methods (here) for safely and efficiently reading and setting 32 or 64 bit values (or references). 

This week, we’ll round out the discussion by talking about the Interlocked CompareExchange() method and how it can be put to use to exchange a value if the current value is what you expected it to be.

Dirty reads can lead to bad results

Many of the uses of Interlocked that we’ve explored so far have centered around either reading, setting, or adding values.  But what happens if you want to do something more complex such as setting a value based on the previous value in some manner?

Perhaps you were creating an application that reads a current balance, applies a deposit, and then saves the new modified balance, where of course you’d want that to happen atomically.  If you read the balance, then go to save the new balance and between that time the previous balance has already changed, you’ll have an issue! 

Think about it, if we read the current balance as $400, and we are applying a new deposit of $50.75, but meanwhile someone else deposits $200 and sets the total to $600, but then we write a total of $450.75 we’ve lost $200!

Now, certainly for int and long values we can use Interlocked.Add() to handles these cases, and it works well for that.  But what if we want to work with doubles, for example? 

Let’s say we wanted to add the numbers from 0 to 99,999 in parallel.  We could do this by spawning several parallel tasks to continuously add to a total:

   1: double total = 0;
   2:  
   3: Parallel.For(0, 10000, next =>
   4:     {
   5:         total += next;
   6:     });

Were this run on one thread using a standard for loop, we’d expect an answer of 4,999,950,000 (the sum of all numbers from 0 to 99,999). 

But when we run this in parallel as written above, we’ll likely get something far off.  The result of one of my runs, for example, was 1,281,880,740.  That is way off!  If this were banking software we’d be in big trouble with our clients. 

So what happened?  The += operator is not atomic, it will read in the current value, add the result, then store it back into the total.  At any point in all of this another thread could read a “dirty” current total and accidentally “skip” our add.  

So, to clean this up, we could use a lock to guarantee concurrency:

   1: double total = 0.0;
   2: object locker = new object();
   3:  
   4: Parallel.For(0, count, next =>
   5:     {
   6:         lock (locker)
   7:         {
   8:             total += next;
   9:         }
  10:     });

Which will give us the correct result of 4,999,950,000.  One thing to note is that locking can be heavy, especially if the operation being locked over is trivial, or the life of the lock is a high percentage of the work being performed concurrently.  In the case above, the lock consumes pretty much all of the time of each parallel task – and the task being locked on is relatively trivial.

Now, let me put in a disclaimer here before we go further:

For most uses, lock is more than sufficient for your needs, and is often the simplest solution!   

So, if lock is sufficient for most needs, why would we ever consider another solution?  The problem with locking is that it can suspend execution of your thread while it waits for the signal that the lock is free.  Moreover, if the operation being locked over is trivial, the lock can add a very high level of overhead.  This is why things like Interlocked.Increment() perform so well, instead of locking just to perform an increment, we perform the increment with an atomic, lockless method.

As with all things performance related, it’s important to profile before jumping to the conclusion that you should optimize everything in your path.  If your profiling shows that locking is causing a high level of waiting in your application, then it’s time to consider lighter alternatives such as Interlocked.

CompareExchange() – Exchange existing value if equal some value

So let’s look at how we could use CompareExchange() to solve our problem above.  The general syntax of CompareExchange() is:

  • T CompareExchange<T>(ref T location, T newValue, T expectedValue)
    • If the value in location == expectedValue, then newValue is exchanged.  Either way, the value in location (before exchange) is returned.

Actually, CompareExchange() is not one method, but a family of overloaded methods that can take int, long, float, double, pointers, or references.  It cannot take other value types (that is, can’t CompareExchange() two DateTime instances directly).  Also keep in mind that the version that takes any reference type (the generic overload) only checks for reference equality, it does not call any overridden Equals().

So how does this help us?  Well, we can grab the current total, and exchange the new value if total hasn’t changed.  This would look like this:

   1: // grab the snapshot
   2: double current = total;
   3:  
   4: // if the total hasn’t changed since I grabbed the snapshot, then
   5: // set it to the new total
   6: Interlocked.CompareExchange(ref total, current + next, current);

So what the code above says is: if the amount in total (1st arg) is the same as the amount in current (3rd arg), then set total to current + next (2nd arg).  This check and exchange pair is atomic (and thus thread-safe).

This works if total is the same as our snapshot in current, but the problem, is what happens if they aren’t the same?  Well, we know that in either case we will get the previous value of total (before the exchange), back as a result.  Thus, we can test this against our snapshot to see if it was the value we expected:

   1: // if the value returned is != current, then our snapshot must be out of date
   2: // which means we didn't (and shouldn't) apply current + next
   3: if (Interlocked.CompareExchange(ref total, current + next, current) != current)
   4: {
   5:     // ooops, total was not equal to our snapshot in current, what should we do???
   6: }

So what do we do if we fail?  That’s up to you and the problem you are trying to solve.  It’s possible you would decide to abort the whole transaction, or perhaps do a lightweight spin and try again.  Let’s try that:

   1: double current = total;
   2:  
   3: // make first attempt...
   4: if (Interlocked.CompareExchange(ref total, current + i, current) != current)
   5: {
   6:     // if we fail, go into a spin wait, spin, and try again until succeed
   7:     var spinner = new SpinWait();
   8:  
   9:     do
  10:     {
  11:         spinner.SpinOnce();
  12:         current = total;
  13:     }
  14:     while (Interlocked.CompareExchange(ref total, current + i, current) != current);
  15: }
  16:  

This is not trivial code, but it illustrates a possible use of CompareExchange().  What we are doing is first checking to see if we succeed on the first try, and if so great!  If not, we create a SpinWait and then repeat the process of SpinOnce(), grab a fresh snapshot, and repeat until CompareExchnage() succeeds.  You may wonder why not a simple do-while here, and the reason it’s more efficient to only create the SpinWait until we absolutely know we need one, for optimal efficiency.

Though not as simple (or maintainable) as a simple lock, this will perform better in many situations.  Comparing an unlocked (and wrong) version, a version using lock, and the Interlocked of the code, we get the following average times for multiple iterations of adding the sum of 100,000 numbers:

   1: Unlocked money average time: 2.1 ms
   2: Locked money average time: 5.1 ms
   3: Interlocked money average time: 3 ms

So the Interlocked.CompareExchange(), while heavier to code, came in lighter than the lock, offering a good compromise of safety and performance when we need to reduce contention.

CompareExchange() - it’s not just for adding stuff…

So that was one simple use of CompareExchange() in the context of adding double values -- which meant we couldn’t have used the simpler Interlocked.Add() -- but it has other uses as well.

If you think about it, this really works anytime you want to create something new based on a current value without using a full lock.  For example, you could use it to create a simple lazy instantiation implementation.  In this case, we want to set the lazy instance only if the previous value was null:

   1: public static class Lazy<T> where T : class, new()
   2: {
   3:     private static T _instance;
   4:  
   5:     public static T Instance
   6:     {
   7:         get
   8:         {
   9:             // if current is null, we need to create new instance
  10:             if (_instance == null)
  11:             {
  12:                 // attempt create, it will only set if previous was null
  13:                 Interlocked.CompareExchange(ref _instance, new T(), (T)null);
  14:             }
  15:  
  16:             return _instance;
  17:         }
  18:     }
  19: }

So, if _instance == null, this will create a new T() and attempt to exchange it with _instance.  If _instance is not null, then it does nothing and we discard the new T() we created.

This is a way to create lazy instances of a type where we are more concerned about locking overhead than creating an accidental duplicate which is not used.  In fact, the BCL implementation of Lazy<T> offers a similar thread-safety choice for Publication thread safety, where it will not guarantee only one instance was created, but it will guarantee that all readers get the same instance. 

Another possible use would be in concurrent collections.  Let’s say, for example, that you are creating your own brand new super stack that uses a linked list paradigm and is “lock free”.  We could use Interlocked.CompareExchange() to be able to do a lockless Push() which could be more efficient in multi-threaded applications where several threads are pushing and popping on the stack concurrently.

Yes, there are already concurrent collections in the BCL (in .NET 4.0 as part of the TPL), but it’s a fun exercise!  So let’s assume we have a node like this:

   1: public sealed class Node<T>
   2: {
   3:     // the data for this node
   4:     public T Data { get; set; }
   5:  
   6:     // the link to the next instance
   7:     internal Node<T> Next { get; set; }
   8: }

Then, perhaps, our stack’s Push() operation might look something like:

   1: public sealed class SuperStack<T>
   2: {
   3:     private volatile T _head;
   4:  
   5:     public void Push(T value)
   6:     {
   7:         var newNode = new Node<int> { Data = value, Next = _head };
   8:  
   9:         if (Interlocked.CompareExchange(ref _head, newNode, newNode.Next) != newNode.Next)
  10:         {
  11:             var spinner = new SpinWait();
  12:  
  13:             do
  14:             {
  15:                 spinner.SpinOnce();
  16:                 newNode.Next = _head;
  17:             }
  18:             while (Interlocked.CompareExchange(ref _head, newNode, newNode.Next) != newNode.Next);
  19:         }
  20:     }
  21:  
  22:     // ...
  23: }

Notice a similar paradigm here as with adding our doubles before.  What we are doing is creating the new Node with the data to push, and with a Next value being the original node referenced by _head.  This will create our stack behavior (LIFO – Last In, First Out).  Now, we have to set _head to now refer to the newNode, but we must first make sure it hasn’t changed!

So we check to see if _head has the same value we saved in our snapshot as newNode.Next, and if so, we set _head to newNode.  This is all done atomically, and the result is _head’s original value, as long as the original value was what we assumed it was with newNode.Next, then we are good and we set it without a lock!  If not, we SpinWait and try again.

Once again, this is much lighter than locking in highly parallelized code with lots of contention.  If I compare the method above with a similar class using lock, I get the following results for pushing 100,000 items:

   1: Locked SuperStack average time: 6 ms
   2: Interlocked SuperStack average time: 4.5 ms

So, once again, we can get more efficient than a lock, though there is the cost of added code complexity.  Fortunately for you, most of the concurrent collection you’d ever need are already created for you in the System.Collections.Concurrent (here) namespace – for more information, see my Little Wonders – The Concurent Collections Part 1 (here), Part 2 (here), and Part 3 (here).

Summary

We’ve seen before how the Interlocked class can be used to safely and efficiently add, increment, decrement, read, and exchange values in a multi-threaded environment.  In addition to these, Interlocked CompareExchange() can be used to perform more complex logic without the need of a lock when lock contention is a concern.

The added efficiency, though, comes at the cost of more complex code.  As such, the standard lock is often sufficient for most thread-safety needs.  But if profiling indicates you spend a lot of time waiting for locks, or if you just need a lock for something simple such as an increment, decrement, read, exchange, etc., then consider using the Interlocked class’s methods to reduce wait.

Print | posted on Thursday, September 6, 2012 7:14 PM | Filed Under [ My Blog C# Software .NET Little Wonders ]

Feedback

Gravatar

# re: C#/.NET Little Wonders: Interlocked CompareExchange()

I am an avid reader as far as .NET is concerned. But i haven't seen such a nice article with perfect examples. Today only i came across Interlocked and got this link from twitter. It was like wow to read this article. Thanks a lot. Going to read all the previous articles of yours. You ROCKS!!!!
9/7/2012 9:36 AM | Bhavik Barot
Gravatar

# re: C#/.NET Little Wonders: Interlocked CompareExchange()

Hi

The examples of code on this blog are very cool (not just on this post), but tell me one thing:

Why do you spin inside a loop? :)

In reality you spin two times as the outer loop is a way of spinning and the inner SpinOnce() method actually will issue a loop spin, and based of the number of calls the spin counter will grow by a factor of ^2, up to 4096, only then will it begin to employ context switching procedures. I assume the train of thought is thread aligning as each failed call will realign the thread as it needs to do more work then its previous pairs, but actually without the code still a fail will realign the threads.

Performance vise this is not a very good way to do it, and the simplest and better way is to not spin in your loop as you *are* in a loop and therefor spinning, so much better way would be to do a light context (like yield) switch after a certain number of fails (there are better ways but more complex).

I do realize that spinning is not the topic of this article but I just wanted to address that little thing and add my 2cents on the topic overall :)

Best Regards
Bartosz Adamczewski
9/11/2012 3:21 AM | badamczewski
Gravatar

# re: C#/.NET Little Wonders: Interlocked CompareExchange()

I think your hyperlink color is too close to your regular text color. I find it difficult to tell what's a link and what's not. Just sayin'.
9/11/2012 6:59 AM | Anonymous
Gravatar

# re: C#/.NET Little Wonders: Interlocked CompareExchange()

Thanks for the heads up, I'll see if i can play with the theme to clear that up.
9/11/2012 7:12 AM | James Michael Hare
Gravatar

# re: C#/.NET Little Wonders: Interlocked CompareExchange()

@Batrosz: Actually I'm recreating the way the ConcurrentStack was coded in the TPL, yes, spinning can be heavier when locks are long-lived, thus they should only be used for points of contention that are very short-lived, such as waiting on a compare-exchange...
9/11/2012 7:14 AM | James Michael Hare
Gravatar

# re: C#/.NET Little Wonders: Interlocked CompareExchange()

@James: Yes you are right with that, but the point i'm trying to address here that even ConcurrentStack in the TPL should not use SpinWait data structure and implement an internal one if they are aiming for as fast as possible execution.

I actually have written an article about some of the issues with such spinning: http://badamczewski.blogspot.com/2012/08/lock-free-and-spinwait-msdn-example.html

This comment goes way out of the context of your article so all other discussions about spining we should do priv :)
9/11/2012 7:56 AM | badamczewski
Gravatar

# re: C#/.NET Little Wonders: Interlocked CompareExchange()

@badamczewski: not sure i agree with your conclusions, per se, but as you said that's off topic and we can hash out in private :-)
9/17/2012 8:21 AM | James Michael Hare
Gravatar

# re: C#/.NET Little Wonders: Interlocked CompareExchange()

I think the Push example has a bug in it wrt. the use of the keyword 'volatile'.
See http://blogs.msdn.com/b/ericlippert/archive/2011/06/16/atomicity-volatility-and-immutability-are-different-part-three.aspx
9/20/2012 4:22 PM | Ryan Janzen
Gravatar

# re: C#/.NET Little Wonders: Interlocked CompareExchange()

Ryan: Actually volatile is what the TPL uses in its ConcurrentStack as well, the volatility doesn't make it atomic, it just keeps it from being cached.
9/20/2012 4:26 PM | James Michael Hare
Gravatar

# re: C#/.NET Little Wonders: Interlocked CompareExchange()

Two typos in your SuperStack example:
"private volatile T _head" should be "private volatile Node<T> _head";
"var newNode = new Node<int>" should be "var newNode = new Node<T>";
9/24/2012 6:19 AM | Richard
Gravatar

# re: C#/.NET Little Wonders: Interlocked CompareExchange()

@Richard: Thanks! I'll clean those up!
9/26/2012 6:01 PM | James Michael Hare
Post A Comment
Title:
Name:
Email:
Comment:
Verification:
 

Powered by: