Misadventures in Concurrent and Parallel programming, plus random comments on software performance and various OSS contributions.
Last week I gave a talk on non-blocking concurrency at JAX London. Here are the slides:
Mike, I've followed the development of the Disruptor and the talks given by you and others with interest. One thing that's always puzzled me is the use of the lazy set. As per its documented semantics it should offer better performance for cases where its guarantees are sufficient. Indeed the graphs on your slide bear this out and the assembly shows different instructions being generated. What puzzles me is that the implementation of lazy set in OpenJDK seems to boil down to a call with volatile semantics and so one would expect it should behave the same as a volatile. But your evidence suggests otherwise. This post describes the details in the OpenJDK source: How-does-AtomicLong-lazySet-work. Any idea/explanation for the discrepancy? Regards, David
Hi David, The original bug report and Doug Lea's cookbook give some insight into the difference. My understanding (read: any mistakes are my own) is that a volatile store will provide a couple of guarantees. Between a normal store and a subsequent volatile store you will get a StoreStore barrier, preventing the normal store from being reordered after the volatile store. You will also get a StoreLoad barrier between a volatile store and a subsequent volatile load, preventing the volatile load from being reordered ahead of the volatile store. The call to lazySet misses out the StoreLoad barrier. While it provides the same store ordering behaviour, it is possible for subsequent volatile reads to be reordered ahead of a lazySet. The StoreLoad barrier is the expensive operation. On Intel this is implemented with a LOCK ADD instruction, which is elided in the lazySet case.
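To make the barrier discussion above concrete, here is a minimal sketch of a single-writer handoff. The class and field names (LazySetHandoff, payload, published) are hypothetical and not taken from the talk or the Disruptor. It assumes only the documented AtomicLong behaviour: the plain write to payload cannot be reordered after the lazySet, and the reader's volatile get orders its subsequent read of payload.

```java
import java.util.concurrent.atomic.AtomicLong;

public class LazySetHandoff {
    // Plain (non-volatile) field, written only by the producing thread.
    static long payload;
    // Publication flag: written with lazySet, read with a volatile get.
    static final AtomicLong published = new AtomicLong(0);

    // Hands one value from the calling thread to a consumer thread
    // and returns what the consumer observed.
    static long runOnce(long value) {
        payload = 0;
        published.set(0);
        final long[] seen = new long[1];
        Thread consumer = new Thread(() -> {
            // Volatile read: spin until the producer's store becomes visible.
            while (published.get() == 0) {
                Thread.yield();
            }
            seen[0] = payload;
        });
        consumer.start();

        payload = value;       // normal store
        published.lazySet(1);  // ordered store: StoreStore before it, no StoreLoad after it

        try {
            consumer.join();   // join makes seen[0] safely readable here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce(42L));
    }
}
```

The producer never reads anything that depends on the flag after writing it, which is the kind of case where dropping the StoreLoad barrier should be safe. If the writer had to read another thread's volatile state immediately after publishing, a full volatile set would be the more cautious choice.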
Hi Mike, Thanks for taking the time to reply. It's not that I don't understand the theory behind the different behaviour and the barriers that need to be in place. It's that I've always thought that lazySet, in practice, ended up using the same underlying mechanisms as set, effectively full volatile semantics. The JDK code seems to indicate that the lazySet ends up calling through to a SET_FIELD_VOLATILE call. Almost as though it was implemented with stronger semantics than specified. I suspect that I'm missing something in the JVM code (I've not checked it all out and crawled through it); perhaps it's being intrinsified somewhere. There is a comment stating "The non-intrinsified versions of setOrdered just use setVolatile", so it may well be that lazySet never gets this far and gets intrinsified before this point. I suspect it must be. Anyhow, I've not used lazySet to date in situations where it would have had a benefit, because of this misplaced belief. A great example where, rather than believing what I thought I was seeing in the code, I should have tested for myself! Regards, David
Hi David, No problem. LazySet is definitely intrinsified (on HotSpot anyway). The place to look for intrinsics is vmSymbols.hpp. Most of the methods on the Unsafe are intrinsics. Actually tracing through to the underlying implementation can take some doing, so I tend to rely more on the generated assembler to get a picture of what's actually happening closer to the metal. Mike.
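For anyone wanting to look at the generated assembler themselves, a small harness along these lines may help. It is only a sketch: the class and method names (StoreKinds, volatileStores, orderedStores) are hypothetical, and the iteration counts are arbitrary values chosen to make the methods hot enough to be JIT-compiled.

```java
import java.util.concurrent.atomic.AtomicLong;

public class StoreKinds {
    static final AtomicLong counter = new AtomicLong();

    // Loop of full volatile stores.
    static long volatileStores(int n) {
        for (int i = 1; i <= n; i++) {
            counter.set(i);
        }
        return counter.get();
    }

    // Loop of ordered (lazySet) stores.
    static long orderedStores(int n) {
        for (int i = 1; i <= n; i++) {
            counter.lazySet(i);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        long sink = 0;
        // Repeat so that HotSpot compiles both methods.
        for (int round = 0; round < 20_000; round++) {
            sink += volatileStores(1_000);
            sink += orderedStores(1_000);
        }
        System.out.println(sink);
    }
}
```

Running it with `java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly StoreKinds` prints the compiled code, provided the hsdis disassembler plugin is installed for the JVM in use. Going by the discussion above, on x86 one would expect a lock-prefixed instruction after the store in volatileStores and only a plain store in orderedStores, though the exact output depends on the JVM version and platform.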
Hi David, Just posted on the topic of lazySet/putOrdered, covering documentation and offering some experiments and stats to establish behaviour. Have a read, I hope it helps clarify: http://psy-lob-saw.blogspot.co.uk/2012/12/atomiclazyset-is-performance-win-for.html Thanks, Nitsan