April 15, 2003

Experiences with the RSS Schema

After some more behind-the-scenes discussion with Gudge, I now have a working Schema for RSS 2.0, which feels like we have reached "Second Base" at least:
http://www.thearchitect.co.uk/schemas/rss-2_0.xsd

As it stands, I have had to use a repeating <choice> for the core channel and item fields, which pretty much destroys the cardinality rules built into the schema definitions. The construct I really wanted to use was <all>, which fits the bill perfectly, but it cannot be used in any nested constructs and therefore cannot accommodate the namespaced extensibility that all modern RSS feed data requires.
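
To illustrate what is lost: a repeating <choice> accepts the core fields in any order, but also any number of times, so a rule like "exactly one title per channel" has to be checked outside the schema. Here is a minimal Python sketch of such a check, assuming the RSS 2.0 rule that a channel carries exactly one title, link and description. The function name is hypothetical and is not part of the published schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumption: per the RSS 2.0 specification, each of these un-namespaced
# elements must appear exactly once inside <channel>.
REQUIRED_ONCE = ("title", "link", "description")


def channel_cardinality_errors(rss_xml):
    """Return a list of cardinality problems for the core channel fields.

    Namespaced extension elements are ignored: ElementTree reports them
    with a "{namespace}name" tag, so they never match the core names.
    """
    root = ET.fromstring(rss_xml)
    channel = root.find("channel")
    if channel is None:
        return ["missing <channel> element"]

    # Count the direct children of <channel> by tag name.
    counts = Counter(child.tag for child in channel)

    errors = []
    for name in REQUIRED_ONCE:
        if counts[name] != 1:
            errors.append(f"<{name}> appears {counts[name]} times, expected 1")
    return errors
```

A feed with two titles and no description passes a schema built on a repeating <choice>, but a check like this one reports both problems.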

All the RSS 2.0 feeds I could lay my hands on so far validated OK, including those from:
Don Box, Tim Ewald, Martin Gudgin, Yasser Shohoud, Chris Anderson, Sam Ruby, Jon Udell, Dave Winer, James Strachan, and mine.

This whole experience was rather disheartening overall though - XML Schema 1.0 simply cannot represent the structural constraints necessary to describe RSS 2.0 data fully and completely, and I lost some important fidelity with the eventual solution.
The other thing I hit upon (several times!) was just how bad some of the popular XML / Schema tools are!
In the end I had to run the schema and sample data through a validating parser just to check the XML Schema definition was actually valid and legal!
Not a very impressive state of the art, and I think there is still a lot of opportunity for good XML / Schema tools in the marketplace.

I think we are also going to have to give much more weight to machine-readable descriptions and validation when creating new data formats in the future. Designing at the XML level can create some fairly unusable formats for modern tools built around XML Schema.

So, please exercise this schema, and let me know of any problems.

Then all we need to do next is see what something like the Microsoft InfoPath tools make of this schema - now where did I put the bat-phone number for the Box Cave.... ;-)

Entry categories: RSS Standards Weblog XML
Posted by Jorgen Thelin at April 15, 2003 09:11 PM - [PermaLink]