I was getting reports about DTCoreText having leaks. Not something I like to wake up to, to be honest. So I dived right in with Instruments and the Leaks tool. I am going to share with you something that I learned about Leaks and Blocks that might save you much trouble if you check for that first the next time you profile any app.
I fired up Instruments with the DTCoreText demo app running in simulator and I was dumbfounded. Lots of small red bars told of unchecked leaks. But what was even more unnerving was the sheer amount of different objects that got leaked. No way that all these different “Responsible Frames” where all leaking.
I did what every engineer does at first: find somebody else to blame. So I pointed the finger at Apple, because obviously NSScanner, NSDictionary and NSString are “leaking like crazy”. What other reason could there be for NSDictionary leaking 64 Bytes on every mutableCopy?
In my desperation I wasted about half a day trying to rewrite methods to leak less, like replace NSDictionary’s mutableCopy with initWithDictionary. But the only thing this achieved was to move the leaks elsewhere. Like a sinking boat I found myself unable to plug any holes because for each plugged one a new leak would form.
Rule number 2 for engineers is: if you cannot achieve perfection, then at least rationalize it as “good enough”. Surely it cannot be a problem leaking 192 Bytes every now and then. We all shipped apps leaking more than that, right?
So, it’s all Apple’s fault and my code is good enough….
The Retain Loop Loupe
But then – while aimlessly clicking around the Instruments interface – I discovered the “Cycles & Roots” tool. Bam!
Next to an amazing Picasso I saw some Complex and some Simple Cycles. But the flow chart made a light go on in my head. Obviously the root cause was DTHTMLAttributedStringBuilder having a dictionary that held onto an NSMallocBlock that referred back to the builder class, a classical retain loop. They should name this graphic “The Retain Loupe” because it magnified the problem sufficiently so that even us dumb developers would get it.
To understand what the above means let me briefly explain an architecture choice I made in the attributed string builder class. In the original code I had an enormous if-else if-else if-etc. tree checking for all different kinds of HTML tag names. I changed all this by having two dictionaries of handlers for starting and ending of tags. Each dictionary contains the blocks to be executed when the opening or closing of certain tags are encountered.
@implementation DTHTMLAttributedStringBuilder { // parsing state, accessed from inside blocks DTHTMLElement *currentTag; // lookup table for blocks that deal with begin and end tags NSMutableDictionary *_tagStartHandlers; NSMutableDictionary *_tagEndHandlers; } - (void)_registerTagStartHandlers { void (^blockquoteBlock)(void) = ^ { currentTag.paragraphStyle.headIndent += 25.0 * textScale; currentTag.paragraphStyle.firstLineHeadIndent = currentTag.paragraphStyle.headIndent; currentTag.paragraphStyle.paragraphSpacing = defaultFontDescriptor.pointSize; }; [_tagStartHandlers setObject:[blockquoteBlock copy] forKey:@"blockquote"]; // ... } |
So in the parser delegate method for opening of tags I can see if there is a block registered for a given tag and execute it if it is.
- (void)parser:(DTHTMLParser *)parser didStartElement:(NSString *)elementName attributes:(NSDictionary *)attributeDict { // find block to execute for this tag if any void (^tagBlock)(void) = [_tagStartHandlers objectForKey:elementName]; if (tagBlock) { tagBlock(); } } |
Isn’t it beautiful?
When the blocks are created the compiler checks which variables are used. If you use a local variable then it is copied. Any objects used thusly are also retained during the lifetime of the block. If you use an instance variable (currentTag) then this also means that self will be captured and retained. The reason for this is that the runtime needs to make sure that all needed objects stay around while the block lives.
Michael Mangold points us to Lecture 10 of Standford’s iOS Development course, which explains the block basics as well.
Simple Fix
The problem with this is that each of these blocks that accesses an ivar retains self, but the string builder class also retains the dictionaries which in turn retain the blocks. Vicious loop, fortunately with a simple fix.
The simple fix for this is to make sure that the two dictionaries are released when no longer needed. In ARC you force a release by nilling the only reference to them.
- (BOOL)buildString { // register default handlers [self _registerTagStartHandlers]; [self _registerTagEndHandlers]; // string building // clean up handlers because they retained self _tagStartHandlers = nil; _tagEndHandlers = nil; return result; } |
That’s all it takes, boom! Gone are all leaks.
So these “Leaks” where no real leaks. They where just the interim objects that made up the NSAttributedString that the string builder produced while building the string from HTML.
Weak Self
The more complicated option revolves around having a way to access IVARs and properties without retaining self. Since the blocks are retained by the dictionaries which are retained by the string builder there is no scenario where the block would still live without the string builder still around. So it is perfectly safe to convert self into a weak reference.
__weak DTHTMLAttributedStringBuilder *weakself = self; void (^someBlock)(void) = ^ { weakself.someProperty = @"bla"; [weakself callMethod:@"bar"]; }; |
Using this approach you can access all properties and methods of self without causing self to be retained. The final piece of the puzzle is how you can access IVARs for which you didn’t create a property. Maybe you don’t want to have the unnecessary Objective-C messaging overhead for just setting a value?
For accessing IVARs Objective-C borrows the arrow operator from C++. This operator is usually used in C++ if you have a pointer to a struct or C++ class and want to access one of the member variables. In Objective-C any IVAR really is just a variable inside a structure too, one that self points to.
__weak DTHTMLAttributedStringBuilder *weakself = self; void (^someBlock)(void) = ^ { weakself->ivar = @"bla"; }; // for example: void (^blockquoteBlock)(void) = ^ { weakself->currentTag.paragraphStyle.firstLineHeadIndent = weakself->currentTag.paragraphStyle.headIndent; weakself->currentTag.paragraphStyle.paragraphSpacing = weakself->defaultFontDescriptor.pointSize; }; |
This is your option number 2, prefixing every property with “weakself.” and every IVAR with “weakself->”.
Ugh, so much extra code! I wish there was some option in ARC to tell the compiler not to retain self to be able to avoid all this extra code.
Conclusion
When debugging/profiling an application that makes use of blocks be sure to check the “Cycles&Roots” view first because those the the easiest to fix. Don’t waste your time fiddling around with the “obvious leaks” and also refrain from blaming Apple.
Of the two methods to avoid a self-retain-cycle I personally prefer the one using less code, i.e. explicitly releasing a container holding onto blocks. But in some situations this might not be an option, so you have to resort to the weakself strategy.
Steve Flack recommends reading Mike Ash’s piece on retain cycles and how these behave with ARC.
Categories: Recipes
I believe there’s a great flaw in your preferred (simple) method of resolving these ‘retain cycles with blocks’ problems. This approach doesn’t resolve the essence of the problem – presence of the retain cycle. It is very error prone and can be considered as a bad design in general. The problem is that it adds a very strong responsibility to every piece of code using these stored blocks – the cycle needs to be broken always after blocks are no longer needed.
Actually the current version of the source I’m seeing right now illustrates my point in a best possible way. When you applied the fix you’ve forgot about additional return point in the -buildString:
https://github.com/Cocoanetics/DTCoreText/blob/c147030ab28cfed99288cefc84ba5a6e77b35198/Core/Source/DTHTMLAttributedStringBuilder.m#L148
Obviously every time the condition is met and function returns an indestructible retain cycle is created and memory is leaked.
The ‘weak self’ approach obviously doesn’t have this disadvantage – i.e. it doesn’t create a cycle in the first place. It has it’s own problems though – it makes code less readable and It is easy to break it by accidentally missing a single ‘self.’ within the block. But I believe it is still a lot better decision design-wise.
To sum up: If you are retaining a block (directly or indirectly) always use ‘weak self’ within the block. Never use a ‘break after use’ approach.
Nice analysis! I especially like the use of Instruments to track it down.
Just one totally pedantic comment – the arrow operator is from C, and predates C++ by a long time.