Using a Glossary to Unwind Comments from Links
Automating a WebLog that is more than a list of links is a challenge, because links and narrative are mixed together and that mix is awkward to represent in XML. One way to solve the problem is to disentangle the links from the narrative in the XML representation.
I got a note from Matt Haughey this evening:
"I'd like to wrap up my MetaFilter (metafilter.com) weblog in XML for syndicated version a few people have been asking about. Your DTD looks pretty complete. But I often have several links per entry, what would you suggest for that? A new element called "extralink" or something like that?"
The problem Haughey's describing is an entry like this:
Sunday, November 14, 1999
Accipe sacrificium confessionum mearum de manu linguae meae, quam formasti et excitasti, ut confiteatur nomini tuo, et sana omnia ossa mea, et dicant: domine, quis similis tibi? neque enim docet te, quid in se agatur, qui tibi confitetur; quia oculum tuum non excludit cor clausum, nec manum tuam repellit duritia hominum: sed solvis eam, cum voles, aut miserans aut vindicans, et non est qui se abscondat a calore tuo.*
There are multiple links within what might be construed as a single entry. How should we represent this in XML?
<?xml version="1.0"?>
<weblog>
  <entry>
    <date>1999-11-14</date>
    <title>lorem ipsum</title>
    <url>http://www.foo.bar/baz</url>
    <description>Accipe sacrificium confessionum mearum de manu linguae meae, quam formasti et excitasti, ut confiteatur nomini tuo, et sana omnia ossa mea, et dicant: domine, quis similis tibi? neque enim docet te, quid in se agatur, qui tibi confitetur; <a href="http://sim.sala.bim/">quia oculum tuum</a> non excludit cor clausum, nec manum tuam repellit duritia hominum: sed solvis eam, cum voles, aut miserans aut vindicans, et non est qui se abscondat a calore tuo.</description>
    <linktext>ut confiteatur nomini tuo</linktext>
  </entry>
  ...
</weblog>
This is what I currently do. I don't like it for two reasons: my XML format is non-standard, and any additional link is embedded inside the description, where it is lost as structured information.
I'd rather use a standard format such as RSS (which I already use for syndication), but RSS is not amenable to the writerly style of comments many WebLogs use. Instead of adding to an existing DTD, the right approach may be to split the WebLog into two XML documents: one with the narrative, another with the links.
Example
Think of the WebLog as two documents: a narrative, and a list of links.
To join the two documents, we can use the Frontier concept of a glossary. The catalog of links in the WebLog's content manager is the glossary. To jump to the resource specified in the glossary, use an indirection.
Suppose the first link in the narrative has id 749 in the link database. Then in the narrative, we could represent the link as: <a href="redirect.php3?id=749">. The redirect.php3 script takes id=749 as an argument, looks up the corresponding URL in the database, and redirects the browser to that location.
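The lookup-and-redirect step can be sketched roughly as follows. This is an illustration in Python rather than the actual redirect.php3 script. The LINKS dictionary is a hypothetical stand-in for the link database, and both URLs in it are placeholders borrowed from the earlier example; only the ids 749 and 750 come from the text.

```python
from urllib.parse import parse_qs

# Hypothetical stand-in for the link database (the "glossary").
# The URLs are placeholders, not real entries.
LINKS = {
    749: "http://www.foo.bar/baz",
    750: "http://sim.sala.bim/",
}


def redirect(query_string):
    """Return (status, headers) for a query string such as 'id=749'."""
    params = parse_qs(query_string)
    try:
        link_id = int(params["id"][0])
        url = LINKS[link_id]
    except (KeyError, ValueError):
        # Missing id, non-numeric id, or an id that is not in the glossary.
        return "404 Not Found", []
    # Send the browser on to the resource named in the glossary.
    return "302 Found", [("Location", url)]
```

Because the narrative only ever stores the id, a URL that moves can be corrected once in the glossary without touching any entry that points at it.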
The XML version of the narrative could be:
<?xml version="1.0"?>
<weblog>
  <entry>
    <date>1999-11-14</date>
    <text><p>Accipe sacrificium confessionum mearum de manu linguae meae, quam formasti et excitasti, <link id="749">ut confiteatur nomini tuo</link>, et sana omnia ossa mea, et dicant: domine, <link id="750">quis similis tibi</link>? neque enim docet te, quid in se agatur, qui tibi confitetur; quia oculum tuum non excludit cor clausum, nec manum tuam repellit duritia hominum: sed solvis eam, cum voles, aut miserans aut vindicans, et non est qui se abscondat a calore tuo.</p></text>
  </entry>
  ...
</weblog>
The content management system would parse the link element and construct the elements <a href="redirect.php3?id=749">...</a> and <a href="redirect.php3?id=750">...</a> when rendering as HTML.
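Here is one way that rendering step might look, as a sketch using Python's standard xml.etree.ElementTree module. The function name render_text and its script parameter are my own inventions for illustration, not part of any existing content manager.

```python
import xml.etree.ElementTree as ET


def render_text(text_elem, script="redirect.php3"):
    """Rewrite each <link id="N"> under text_elem as <a href="script?id=N">."""
    # Collect the matches first, since the loop changes their tag names.
    for link in list(text_elem.iter("link")):
        link_id = link.attrib.pop("id")
        link.tag = "a"
        link.set("href", f"{script}?id={link_id}")
    # Serialize the children of <text>, dropping the wrapper element itself.
    return "".join(ET.tostring(child, encoding="unicode") for child in text_elem)


sample = ET.fromstring(
    '<text><p>quam formasti et excitasti, '
    '<link id="749">ut confiteatur nomini tuo</link>, et sana omnia ossa mea</p></text>'
)
html = render_text(sample)
```

The narrative XML stays free of URLs; only the renderer knows that a link id becomes a call to the redirect script.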
A separate interface could produce the links in an RSS file independent of the WebLog's XML representation.
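As a rough sketch of that separate interface, the function below turns a list of (title, URL) pairs from the link catalog into a bare-bones RSS document. The function name, the channel values, and the sample link titles are all hypothetical, and the output carries only a subset of the elements a complete feed would include.

```python
import xml.etree.ElementTree as ET


def links_to_rss(channel_title, channel_url, links):
    """Build a minimal RSS document from (title, url) pairs."""
    rss = ET.Element("rss", version="0.91")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_url
    # One <item> per entry in the link catalog.
    for title, url in links:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
    return ET.tostring(rss, encoding="unicode")


# Placeholder catalog entries, reusing the URLs from the earlier example.
feed = links_to_rss(
    "Example WebLog",
    "http://www.foo.bar/",
    [("Link One", "http://www.foo.bar/baz"), ("Link Two", "http://sim.sala.bim/")],
)
```

The feed is generated from the glossary alone, so it never has to parse the narrative.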
If you weren't concerned with the XML representation of the narrative, you could use a tool like Blogger to manage the narrative and pull it into the WebLog's page with a server-side include. You would then have to hand-code the redirect.php3 references yourself.
The intermingling of links and narrative makes a simple XML representation of a WebLog's content difficult. I've simplified the problem by refactoring the WebLog's XML representation as two documents. I'll be experimenting with this approach in future WebLog automation.
© 1999, Bill Humphries
Thanks to LemonYellow for the idea of using "Augustine's Confessions" instead of lorem ipsum for filler text.