Wepricot.fetch('http://google.com') do |result|
  puts "URL: #{result.url.absoluteString}, source: #{result.html}"
  i = 0
  (result / 'img').each do |node|
    src = node['src']
    resource = result[src]
    puts "DOM traversed img (##{i}) with src: #{src}"
    puts "resource: #{resource.MIMEType}, #{resource.data.length} bytes"
    i += 1
  end
end
You wonder why I like MacRuby? In an hour and a half and 190 lines of code I wrote the beginnings of a layer that essentially replaces what I use Hpricot for (parsing HTML into a tree). Because it is built on WebKit, it gets “native” access to the DOM, selector queries and the loaded resources themselves.
Output:
URL: http://www.google.com/, source: [..snip..]
DOM traversed img (#0) with src: /intl/en_ALL/images/logo.gif
resource: image/gif, 8558 bytes
That is exactly why I love MacRuby. Mad ideas implemented in a jiffy by crossing the streams.
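For anyone curious what the shape of such a layer might look like, here is a minimal sketch in plain Ruby. It is not the actual Wepricot code: Node, Resource and Result are hypothetical stand-ins for the WebKit-backed objects, the "selector" only matches bare tag names, and the page data is made up. It only illustrates how `result / 'img'` and `result[src]` could be wired up as ordinary operator methods.

```ruby
# Hypothetical stand-in for a DOM element; the real layer would wrap a WebKit node.
class Node
  attr_reader :tag

  def initialize(tag, attributes)
    @tag = tag
    @attributes = attributes
  end

  # node['src'] reads an attribute by name.
  def [](name)
    @attributes[name]
  end
end

# Hypothetical stand-in for a loaded sub-resource (image, stylesheet, ...).
Resource = Struct.new(:mime_type, :data)

# Hypothetical stand-in for the object yielded to the fetch block.
class Result
  attr_reader :url, :html

  def initialize(url, html, nodes, resources)
    @url = url
    @html = html
    @nodes = nodes
    @resources = resources
  end

  # result / 'img' runs a query; this sketch only matches bare tag names.
  def /(selector)
    @nodes.select { |node| node.tag == selector }
  end

  # result[src] looks up a loaded resource by the URL the page referenced.
  def [](src)
    @resources[src]
  end
end

# Made-up page data, just to exercise the two operators.
result = Result.new(
  'http://example.com/',
  '<img src="/logo.gif">',
  [Node.new('img', { 'src' => '/logo.gif' })],
  { '/logo.gif' => Resource.new('image/gif', 'GIF89a') }
)

(result / 'img').each do |node|
  resource = result[node['src']]
  puts "img with src #{node['src']}: #{resource.mime_type}, #{resource.data.length} bytes"
end
```

In the real thing, the work behind `/` and `[]` would presumably be handed straight to WebKit rather than to an array and a hash, which is where the “native” access comes from.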
Update: this code is now available.