finally a bnode with a uri

Posts tagged with: structured blogging

A Comparison of Microformats, eRDF, and RDFa

An updated (and customizable) comparison of the different approaches for semantically enhancing HTML.
Update (2006-02-13): In order to avoid further flame wars with RDFa folks, I've adjusted the form to not show my personal priorities as default settings anymore (here they are if you are interested, it's a 48-42-40 ranking for MFs, eRDF, and RDFa respectively). All features are set to "Nice to have" now. As you can see, for these settings, RDFa gets the highest ranking (I *said* the comparison is not biased against RDFa!). If you disable the features related to domain-independent resource descriptions, MFs shine, if you insist on HTML validity, eRDF moves up, etc. It's all in the mix.

After a comment of mine on the Microformats IRC channel, SWD's Michael Hausenblas asks for the reason why I said that I personally don't like RDFa. Damn public logs ;) OK, now I have to justify that somehow without falling into rant mode again...

I already wrote a little comparison of Microformats, Structured Blogging, eRDF, and RDFa some time ago, sounds like a good opportunity to see how things evolved during the last 8 months. Back then I concluded that both eRDF and RDFa were preferred candidates for SemSol, but that RDFa lacked the necessary deployment potential due to not being valid HTML (as far as any widespread HTML spec is concerned).

I excluded the Structured Blogging initiative from this comparison, it seems to have died a silent death. (Their approach to redundantly embed microcontent in script tags apparently didn't convince the developer community.) I also excluded features which are equally available in all approaches, such as visible metadata, general support for plain literals, being well-formed, no negative effect on browser behaviour, etc.

Pretending to be constructive, and in order to make things less biased, I embedded a dynamic page item that allows you to create your own, tailored comparison. The default results reflect my personal requirements (and hopefully answer Michael's question). As your mileage does most probably vary, you can just tweak the feature priorities (The different results are not stored, but the custom comparisons can be bookmarked). Feel free to leave a comment if you'd like me to add more criteria.

No. Feature or Requirement Priority MFs eRDF RDFa
1 DRY (Don't Repeat Yourself) yes yes mostly
2 HTML4 / XHTML 1.0 validity yes yes no
3 Custom extensions / Vocabulary mixing no yes yes
4 Arbitrary resource descriptions no yes yes
5 Explicit syntactic means for arbitrary resource descriptions no no yes
6 Supported by the W3C partly partly yes
7 Follow DCMI guidelines no yes no
8 Stable/Uniform syntax specification partly yes yes
9 Predictable RDF mappings mostly yes yes
10 Live/Web Clipboard Compatibility yes mostly mostly
11 Reliable copying, aggregation, and re-publishing of source chunks. (Self-containment) mostly partly partly
12 Support for not just plain literals (e.g. typed dates, floats, or markup). yes no yes
13 Triple bloat prevention (only actively marked-up information leads to triples) yes yes no
14 Possible integration in namespaced (non-HTML) XML languages. no no yes
15 Mainstream Web developers are already adopting it. yes no no
16 Tidy-safety (Cleaning up the page will never alter the embedded semantics) yes yes no
17 Explicit support for blank nodes. no no yes
18 Compact syntax, based on existing HTML semantics like the address tag or rel/rev/class attributes. yes mostly partly
19 Inclusion of newly evolving publishing patterns (e.g. rel="nofollow"). yes no partly
20 Support for head section metadata such as OpenID or Feed hooks. no partly partly

Results

Solution Points Missing Requirements
RDFa 35 -
eRDF 34 -
Microformats 33 -

Max. points for selected criteria: 60

Summary:

Your requirements are met by RDFa, or eRDF, or Microformats.

Feature notes/explanations:

DRY (Don't Repeat Yourself)
  • RDFa: Literals have to be redundantly put in "content" attributes in order to make them un-typed.
HTML4 / XHTML 1.0 validity
  • RDFa: Given the buzz around the WHATWG, it's uncertain when (if at all) XHTML 2 or XHTML 1.1 modules will be widely deployed enough.
Explicit syntactic means for arbitrary resource descriptions
  • eRDF: owl:sameAs statements (or other IFPs) have to be used to describe external resources.
Supported by the W3C
  • MFs, eRDF: Indirectly supported by W3C's GRDDL effort.
Stable/Uniform syntax specification
  • MFs: Although MFs reuse HTML structures, the format syntax layered on top differs, so that each MF needs separate (though stable) parsing rules.
Predictable RDF mappings
  • MFs: Microformats could be mapped to different RDF structures, but the GRDDL WG will probably recommend fixed mappings.
Live/Web Clipboard Compatibility
  • eRDF, RDFa: Tweaks are needed to make them Live-Clipboard compatible.
Reliable copying, aggregation, and re-publishing of source chunks. (Self-containment)
  • MFs: Some Microformats (e.g. XFN) lose their intended semantics when regarded out of context.
  • eRDF/RDFa: Only chunks with nearby/embedded namespace definitions can be reliably copied.
Support for head section metadata such as OpenID or Feed hooks.
  • eRDF: Can support openID hooks.
  • RDFa: Will probably interpret any rel attribute.


Bottom line: For many requirement combinations a single solution alone is not enough. My tailored summary suggests for example that I should be fine with a combination of Microformats and eRDF. How does your preferred solution mix look like?

ARC Embedded RDF (eRDF) Parser for PHP

Announcing eRDF support for ARC + an eRDF/RDFa comparison
Update: The current RDFa primer is *not* broken wrt to WebArch, the examples were fixed two weeks ago. I've also removed the "no developer support" rant, just received personal support ;-)

While searching for a suitable output format for a new RDF framework, I've been looking at the various semantic hypertext approaches, namely microformats, Structured Blogging, RDFa, and Embedded RDF (eRDF). Each one has its pros and cons:

Microformats:
  • (+) widest deployment so far
  • (+) integrate nicely with current HTML and CSS
  • (-) centralized project, inventing custom microformats is discouraged
  • (-) don't scale, the number of MFs will either be very limited, or sooner or later there will be class name collisions

Structured Blogging:
  • (+) a large number of supporters (at least potentially, the supporters list is huge, although this doesn't represent the available tools)
  • (+) not a competitor, but a superset of microformats
  • (-) the metadata is embedded in a rather odd way
  • (-) the metadata is repeated
  • (-) the use cases are limited (e.g. reviews, events, etc)

RDFa:
  • (+) follows certain microformats principles (e.g. "Don't repeat yourself")
  • (+) freely extensible
  • (+) All resource descriptions (e.g. for events, profiles, products, etc.) can be extracted with a single transformation script
  • (+) RDF-focused
  • (+) W3C-supported
  • (-) Not XHMTL 1.0 compliant, it will take some time before it can be used in commercial products or picky geek circles
  • (-) The default datatype of literals is rdf:XMLLiteral which is wrong for most deployed properties

eRDF:
  • (+) follows the microformats principles
  • (+) freely extensible
  • (+) All resource descriptions (e.g. for events, profiles, products, etc.) can be extracted with a single transformation script
  • (+) uses existing markup
  • (+) XHTML 1.0 compliant
  • (+) RDF-focused
  • (-) Covers only a subset of RDF
  • (-) Does not support XML literals

So, both RDFa and eRDF seem like good candidates for embedding resource descriptions in HTML. The two are not really compatible, though, it is not easily possible to create a superset which is both RDFa and eRDF. However, my publishing framework is using a Wiki-like markup language (M4SH) which is converted to HTML, so I can add support for both approaches and make the output a configuration option. Maybe it's even possible to create a merged serialization without confusing transformers.

I'll surely have another look at RDFa when there is better deployment potential. For now, I've created a M4SH-to-eRDF converter (which is going to be available as part of the forthcoming SemSol framework), and an eRDF parser that can generate RDF/XML from embedded RDF. I've also added some extensions to work around (plain) eRDF's limitations, the main one being on-the-fly rewriting of owl:sameAs assertions to allow full descriptions of remote resources, e.g.
<div id="arc">
  <a rel="owl-sameAs" href="http://example.com/r/001#001"></a>
  <a rel="doap-maintainer" href="#ben">Benjamin</a>
</div>
is automatically converted to
<http://example.com/r/001#001> doap:maintainer <#ben>

The parser can be downloaded at the ARC site (documentation).
I've also put up a little demo service if you want to test the parser.

Archives/Search

YYYY or YYYY/MM
No Posts found

Feeds