Relative urls in blog feeds
Andras Frenyo has brought to my attention the fact that for some feeds in the blog aggregator, links or even images don't work. Andras, as a content producer, does not like his content to be misrepresented, and requested I remove his blog from the aggregator.
The problem is a common one in 'syndication'. In syndication feeds, such as RSS and Atom, links and images are supposed to be fully qualified URIs; every link should carry the whole http://server part instead of pointing to a location relative to the current document. The idea is that since the content is presented at another location, the 'current document' is undefined, so relative URLs have nothing to resolve against.
However, most CMSs, Drupal included, copy the original post directly into their feeds, relative URLs and all. This often makes aggregated content 'not work': relative links in the original post result in relative links in the feed, which are no longer valid once the feed is aggregated on another site. This is true not just for the IVRPA site, but also for the 'real' aggregators such as bloglines.com and RSS clients such as netnewswire and newsgator.
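To make the breakage concrete, here is a small Python sketch of what a browser does with the same relative image path on two different pages. The host names and the path are made up for illustration; they are not taken from any of the feeds discussed here.

```python
from urllib.parse import urljoin

# Hypothetical addresses, for illustration only.
ORIGINAL_POST = "http://blog.example.org/2006/05/my-post"
AGGREGATOR_PAGE = "http://aggregator.example.net/blogs"

# A relative image path as it might appear in the original post.
relative_src = "/files/pano.jpg"

# A browser resolves the path against whatever page it is showing.
on_origin = urljoin(ORIGINAL_POST, relative_src)
on_aggregator = urljoin(AGGREGATOR_PAGE, relative_src)

print(on_origin)      # http://blog.example.org/files/pano.jpg
print(on_aggregator)  # http://aggregator.example.net/files/pano.jpg
```

The second address points at the aggregating site, where the image was never uploaded, which is why it shows up broken there.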
What the aggregator does is strictly correct; it just displays content the way it gets it. Relative URLs are broken. This does not mean that I am happy with those broken links though... I will do my best to fix the issue for our aggregator, though as I have outlined above that might not be an easy task; the connection to the original location the URLs are relative to is severed.
I will also see if I can fix the other side of the story; currently Drupal's feeds are as broken as the rest of them, so our own feeds don't work correctly in aggregators either.
A note to content owners: If you don't like how your content is aggregated, let me know and I will remove your content. Be aware though, that the reason your content is broken is that the feed you are producing is broken; your content will not display correctly anywhere it is aggregated, so you might want to rethink whether you want to syndicate your content in the first place.
Further reading:
http://weblog.philringnalda.com/2003/04/03/again-with-the-relative-urls
http://simon.incutio.com/archive/2003/04/04/#lettingOffSomeSteam
Update: Since posting the entry above, I have implemented a fix that might just do the trick. Before, most links on the aggregated version of Roger Howard's blog would be broken. Now they work.
The only 'broken' links that I can find in the current set of aggregated blogs are the images in Andrew Hudson Smith's Digital Urban, where the images are purposely blocked because we are not Blogger.com. Really, what's the use of syndication if you're going to block aggregators that way?
Unfortunately, the fix still would not help Andras' Panoramablog, because the issue with his links is another one.

Fixes?
I can see two places to fix this:
1) Fix the Drupal syndication engine; obviously, this should be done to make it more useful to external aggregators which expect absolute URLs, and it won't break others which can already deal with relative ones. Of course, this assumes Drupal sites are the only problem, and that this fix will get to the other site operators.
2) Fix the aggregator to assume a base URL which it can concatenate with any relative URLs it finds in feeds; this doesn't address the root cause, but it fixes things for this site at least.
Either should be a pretty easy hack.
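As a rough illustration of the second option, here is a minimal Python sketch that rewrites href and src attributes against a base URL. It assumes the link of the feed item is a usable base, and it uses a regular expression rather than a real HTML parser, so it will miss unquoted attributes and other edge cases. The function name absolutize and the example addresses are hypothetical; this is not the fix that was actually deployed on the site. The same rewrite could in principle be run on the publishing side, which is option 1.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or src="..." with single or double quotes.
_ATTR = re.compile(r'''\b(href|src)\s*=\s*(["'])(.*?)\2''', re.IGNORECASE)


def absolutize(html, base_url):
    """Rewrite relative href/src values in html against base_url.

    Values that are already absolute are left unchanged by urljoin.
    """
    def fix(match):
        attr, quote, value = match.groups()
        return f"{attr}={quote}{urljoin(base_url, value)}{quote}"

    return _ATTR.sub(fix, html)


# Hypothetical feed item: its link serves as the assumed base URL.
item_link = "http://blog.example.org/2006/05/my-post"
body = (
    '<a href="../about">About</a> '
    '<img src="/files/pano.jpg"> '
    '<a href="http://other.example.com/x">x</a>'
)

print(absolutize(body, item_link))
```

The catch is the one described in the post: this only works when the feed still tells the aggregator where the content originally lived. If the item link is missing or points somewhere other than the original post, the guessed base is wrong and so are the rewritten links.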
Am only really getting back on my blog horse and have a lot of stuff to fix on my site.
Thanks for syndicating me! :)