waffle

Ruby the Same Token

A Ruby HTML5 tokenizer, written, fittingly, by Sam Ruby, as a port of the Python html5lib tokenizer.

As a result of rigorous specification and careful thought, the HTML5 tokenizer is pretty easy to write and works on almost all HTML and XHTML documents. I implemented it in Objective-C once, and it currently powers an unadvertised feature in Monocle (all the way back to 1.0): parsing of Mycroft search engines, so any Mycroft link that works in Firefox can be clicked from within the ‘add engine’ sheet instead.
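To give a feel for why it is easy to write: the specification describes the tokenizer as a state machine that consumes one character at a time. What follows is only a rough sketch of that shape in Python, not Sam Ruby's code, not html5lib's API, and nowhere near the full spec. The `State` names and the `tokenize` function are made up for illustration, and attributes, comments, doctypes, entities and parse-error reporting are all left out.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical, heavily reduced set of states; the real spec has many more.
    DATA = auto()
    TAG_OPEN = auto()
    END_TAG_OPEN = auto()
    TAG_NAME = auto()
    AFTER_NAME = auto()


def tokenize(html):
    """Yield ("text", s), ("start", name) and ("end", name) tuples."""
    state = State.DATA
    text, name, is_end = [], [], False
    i = 0
    while i < len(html):
        ch = html[i]
        is_letter = ch.isascii() and ch.isalpha()
        if state is State.DATA:
            if ch == "<":
                state = State.TAG_OPEN
            else:
                text.append(ch)
        elif state is State.TAG_OPEN:
            if ch == "/":
                state = State.END_TAG_OPEN
            elif is_letter:
                if text:  # flush pending character data before the tag
                    yield ("text", "".join(text))
                    text = []
                name, is_end = [ch.lower()], False
                state = State.TAG_NAME
            else:
                # Not a tag after all: keep "<" as text, reprocess ch as data.
                text.append("<")
                state = State.DATA
                continue
        elif state is State.END_TAG_OPEN:
            if is_letter:
                if text:
                    yield ("text", "".join(text))
                    text = []
                name, is_end = [ch.lower()], True
                state = State.TAG_NAME
            else:
                text.append("</")
                state = State.DATA
                continue
        elif state is State.TAG_NAME:
            if ch == ">":
                yield ("end" if is_end else "start", "".join(name))
                state = State.DATA
            elif ch.isspace() or ch == "/":
                state = State.AFTER_NAME  # this sketch ignores attributes
            else:
                name.append(ch.lower())
        else:  # State.AFTER_NAME: skip everything up to the closing ">"
            if ch == ">":
                yield ("end" if is_end else "start", "".join(name))
                state = State.DATA
        i += 1

    # End of input: a dangling "<" or "</" is plain text; an unfinished tag
    # is dropped.
    if state is State.TAG_OPEN:
        text.append("<")
    elif state is State.END_TAG_OPEN:
        text.append("</")
    if text:
        yield ("text", "".join(text))
```

Calling `list(tokenize("<p>Hi <b>there</b></p>"))` walks the string once and yields start, text and end tokens in document order. The real tokenizer has the same shape, just with many more states and carefully specified recovery for malformed input, which is where the "works on almost all documents" part comes from.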
