The parsing of HTML necessary for my DTCoreText open source project is done entirely with the sophisticated use of NSScanner. But it has long been on my list to rewrite all of this parsing with the industry standard libxml2, which comes preinstalled on all iOS devices. Not only is this potentially much faster in dealing with large chunks of HTML, it is probably also more intelligent in correcting structural errors in the HTML code you throw at it.
In part 1 of this series I showed you how to add the libxml2 library to your project and explained the basic concepts of using libxml2 for parsing any odd HTML snippet into a DOM. In this part 2 we will create an Objective-C based wrapper around libxml2 so that we can use it just like NSXMLParser, only for HTML.