In one of the experiments I'm running for my research, I have to take a snapshot of a page and serve it locally. Of course, if I just grab the HTML, any relative URLs will break and the locally served page is unlikely to look much like the original. So, I put together a bit of code to make the links absolute. I remember trying to do this a few years ago in Python and having enormous headaches, but this Ruby version was relatively painless. That says more about my skills as a coder than anything about the relative (get it?) merits of Python and Ruby.

%w[uri net/http hpricot].each { |lib| require lib }
url = 'http://en.wikipedia.org/wiki/Night'
# Parse the base URL once; it is used for the fetch and for every merge below
base = URI.parse(url)
response = Net::HTTP.get_response(base)
body = Hpricot.parse(response.body)
# Elements and the attributes on them that can hold a URL
absolutisable = {
  'a'          => %w[href],
  'applet'     => %w[codebase],
  'area'       => %w[href],
  'blockquote' => %w[cite],
  'body'       => %w[background],
  'del'        => %w[cite],
  'form'       => %w[action],
  'frame'      => %w[longdesc src],
  'iframe'     => %w[longdesc src],
  'head'       => %w[profile],
  'img'        => %w[longdesc src usemap],
  'input'      => %w[src usemap],
  'ins'        => %w[cite],
  'link'       => %w[href],
  'object'     => %w[classid codebase data usemap],
  'q'          => %w[cite],
  'script'     => %w[src],
}
# Rewrite every URL-bearing attribute that is present as an absolute URL
(body/absolutisable.keys.join('|')).each do |elem|
  absolutisable[elem.name].each do |attr|
    uri = elem.attributes[attr]
    elem.raw_attributes[attr] = base.merge(uri).to_s unless uri.nil?
  end
end
puts body
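
For a feel of what merge does with different kinds of link, here's a minimal sketch using only the standard uri library. The absolutise helper and the sample links are made up for illustration; they aren't part of the script above.

```ruby
require 'uri'

# Hypothetical helper, for illustration only: resolve one link against a base URL.
def absolutise(base_url, link)
  URI.parse(base_url).merge(link).to_s
end

base = 'http://en.wikipedia.org/wiki/Night'

# The sample links are invented; each is a different flavour of (relative) URL.
samples = ['/wiki/Day', 'Day', '//upload.wikimedia.org/a.png', '#History', 'http://example.org/']
samples.each do |link|
  puts "#{link} -> #{absolutise(base, link)}"
end
```

Links that are already absolute come back unchanged, which is why the script can push every attribute through merge without checking first.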

This code doesn't handle stylesheets pulled in with @import, and url() references inside CSS will still break, but I think it accounts for everything else.
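
One way the CSS gap might be narrowed is to run a regular expression over any CSS text and push each url(...) reference through the same merge call. This is a sketch under assumptions, not something I've wired into the script: absolutise_css is a made-up name, the regex only understands plain url(...) references, and @import rules written with a bare string are still left alone.

```ruby
require 'uri'

# Hypothetical helper (not in the original script): make url(...) references
# in a chunk of CSS absolute, relative to the URL the CSS was served from.
def absolutise_css(css, base_url)
  base = URI.parse(base_url)
  # Captures an optional quote character, then the reference itself.
  css.gsub(/url\(\s*(['"]?)([^'")]+?)\1\s*\)/) do
    match, quote, ref = $&, $1, $2
    # data: URIs are self-contained, so leave them exactly as written.
    next match if ref.start_with?('data:')
    begin
      "url(#{quote}#{base.merge(ref)}#{quote})"
    rescue URI::Error
      match # anything the uri library can't parse is left untouched
    end
  end
end

css = "body { background: url(../img/bg.png) } .a { background: url('/x.png') }"
puts absolutise_css(css, 'http://example.com/css/site.css')
```

For an external stylesheet the base should be the stylesheet's own URL rather than the page's, since that's what relative references in CSS resolve against.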

Chris J. Davis has started working on some image editing software focussing on imagery for the Web. The working title of the project is Simpleshop, because the impetus is the huge number of Photoshop features that go unused when you only use it for Web images. Knowing Chris, Simpleshop will be a complete misnomer before long and the application will definitely be doing its own thing rather than simply being a cut-down version of the other 'shop. Chris will open source the project when he has a working prototype.
IE8 now renders the “Acid2 Face” correctly in IE8 standards mode.
There's recently been a lot of noise about a return to the browser wars (Alex Russell, Jeff Croft, Stuart Langridge, James Bennett). The point is that standards take eons to complete, and standards bodies aren't the right people to be inventing cool stuff for us to use on the web; it's us and the browser makers who should be creating the cool stuff for the standards bodies to codify. OK, that all sounds great (albeit an incredible simplification of a multifaceted issue). So, let's go out and push that envelope.
To get a better future, not only do we need a return to “the browser wars”, we need to applaud and use the hell out of “non-standard” features until such time as there’s a standard to cover equivalent functionality. Non-standard features are the future, and suggesting that they are somehow “bad” is to work against your own self-interest.
One of the stated reasons for the Lonely Planet sale to the BBC earlier this year was to expand the "digital" aspect of the business, an area where they had so far failed to leverage their reputation. For digital we really have to read online. After all, the Thorn Tree is great, but it's really just a forum. The recent story about technology issues at the BBC makes one question whether it was the right partner for online innovation. Here's Tony and Maureen Wheeler talking about the sale.
People have been freaking out about the virtuality of data for decades, and you'd think we'd have internalized the obvious truth: there is no shelf. In the digital world, there is no physical constraint that's forcing this kind of organization on us any longer. We can do without it, and you'd think we'd have learned that lesson by now.
By itself, a web page that lets you send SMS messages or make phone calls isn't really all that exciting. What makes the Mojo stuff cool is that it's all accessible with RESTful web APIs. If you can drive an http: connection, you can send messages and make calls.
After their successful work on HTML5, CSS5, XML5, SVG5, and Web5, the WHATWG has announced that it has started work on a new version of the Bible, to be called "Bible5".
If your web application hosts any valuable information at all, it’s prudent to expect that some significant proportion of your users will eventually have their accounts hijacked.