@synchronized swimming
October 30th, 2006 | Published in Google Mac Blog
Posted by: Dave MacLachlan, Member of Technical Staff, Mac Team
(Editor's note: today's post is a bit different from our usual fare -- it's aimed at the Mac programmers out there. And if you're not a programmer, you might want to find one to guide you through this peek behind the scenes of how we make our applications fast and reliable.)
At Google, software performance is extremely important. Every millisecond counts, which is why we spend a lot of time using performance tools and other techniques to help make our software faster. I was recently Sharking a piece of multithreaded code and realized we were getting bitten by the use of an @synchronized block around a shared resource we were using:
+(id)fooFerBar:(id)bar {
  static NSDictionary *foo = nil;
  @synchronized(self) {
    if (!foo) foo = [NSDictionary dictionaryWithObjects:...];
  }
  return [foo objectForKey:bar];
}
Shark told us without a doubt that we were paying heavily for the @synchronized block each of the millions of times we were calling fooFerBar. We couldn't create the resource in +initialize, because fooFerBar was part of a category, and overriding +initialize in a category is a bad thing. We also couldn't use +load, because other classes could easily have called fooFerBar in their +load, and there's no guarantee on loading order. So our only choice was to minimize the impact of that @synchronized block, and we didn't want to run into the infamous and dreaded double-checked locking anti-pattern.

So, I wondered, how exactly does @synchronized work? And is there a cheaper way of getting the same thread-safe result? I disassembled the code to find out what @synchronized does, and I saw something like this:
...
objc_sync_enter
objc_exception_try_enter
setjmp
objc_exception_extract
my actual code
objc_exception_try_exit
objc_sync_exit
...
objc_exception_throw
...
That's a lot of setup and tear-down for a simple lock around a shared resource. In this case, we don't need to be exception safe. By reading the Objective-C documentation on exception handling and thread synchronization, we learn that not only does @synchronized give us a lock, but it's a recursive lock, which is overkill for this particular usage.

By examining the code (ADC registration required) that implements objc_sync_enter and objc_sync_exit, we can see that on every @synchronized(foo) block, we are actually paying for 3 lock/unlock sequences. objc_sync_enter calls id2data, which is responsible for getting the lock associated with foo, and then locks it. objc_sync_exit also calls id2data to get the lock associated with foo, and then unlocks it. And id2data must lock/unlock its own internal data structures so that it can safely get the lock associated with foo, so we pay for that on each call as well.

We need to do better than this. It looks like it's time to go back to basics, throw away the @synchronized call, and wrap our code with some pthread locks instead:
#include <pthread.h>

+(id)fooFerBar:(id)bar {
  static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
  if (pthread_mutex_lock(&mtx) != 0) {
    printf("lock failed sigh...");
    exit(-1);
  }
  static NSDictionary *foo = nil;
  if (!foo) foo = [NSDictionary dictionaryWithObjects:...];
  if (pthread_mutex_unlock(&mtx) != 0) {
    printf("unlock failed sigh...");
    exit(-1);
  }
  return [foo objectForKey:bar];
}
This is ugly stuff, but it's significantly faster, according to Shark. And fast is what we want. We've avoided setting up an exception stack, two excess locks, and a bunch of miscellaneous support code.
So we've achieved our goal of faster code that will work fine, but are there other, cleaner options? After all, if the code is cleaner, there are fewer places for bugs to hide. Tune in for our next post, wherein we'll explore that question.