14 January 2010

About the hReview microformat

While looking for information on support for Atom blog posting, I came across the hReview microformat. It is an XHTML microformat, which basically means you can treat it as a small, optional layer of class-name conventions with specialized semantics on top of ordinary markup. It is a general format for reviews (of restaurants, movies, products, anything).
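
For concreteness, here is roughly what a published hReview looks like: ordinary HTML, with the format's class names layered on top. This is a minimal sketch; the item, reviewer, and rating are all made up.

    <div class="hreview">
      <span class="version">0.4</span>
      <h3 class="summary">A quiet, sturdy keyboard</h3>
      <p>
        The <span class="type">product</span>
        <span class="item"><span class="fn">ExampleCo Model K</span></span>
        gets a <span class="rating">4.5</span> from
        <span class="reviewer vcard"><span class="fn">Jane Doe</span></span>
        on <abbr class="dtreviewed" title="2010-01-14">14 January 2010</abbr>.
      </p>
      <div class="description"><p>Good key feel, though the function keys
      are oddly placed.</p></div>
    </div>

Everything the format cares about rides along in the class attributes; the visible text is left alone.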

What I like, and my vision

I like the idea of specifying a general purpose review format and the decision to use a microformat for it. I like it because I have a vision about reviewing functionality in software.

  • When something you can review is fresh in your mind, you should be able to rate it immediately and publish that review. For instance:
    • In your web browser as you are surfing a site.
    • In your music player, while a song or album is playing, when it finishes, or when you stop a song (maybe you hated it) or repeat one.
    • In a video game, especially one whose content comes from many sources, for instance user-generated content.
  • Applications shouldn't have to roll their own functionality for this. They should fill in a few obvious fields and hand the work off to a library, or a plug-in, or an external program.
  • Casual reviewers shouldn't have to choose between learning some site's review format and limiting their review's distribution. This is something the hReview format helps make possible.
  • Since others, like you, can review at the push of a button and their reviews share a common format,
    • Tons of reviews are available to you without really digging for them. You don't have to keep track of what review sites exist for each type of thing.
    • Those reviews are easy to search.
    • Those reviews can be collectively summarized.

What I don't like

What I don't like are the specifics of hReview. These fields have problems, in my opinion:

type
Type has to be "one of the following: product, business, event, person, place, website, url".
  • The categories are too broad. Suppose I'm looking for reviews of blogging hosts. Seems like everything under "website" qualifies. How can I narrow the search to "blogging hosts" if the categories are this broad?

    One cannot rely on the "item" element to distinguish subtypes. "Item" can be an hCard, which (currently) doesn't allow any subtype entry. Or similarly, an hCalendar event.

  • The categories seem to overlap: either they really do, or it takes special knowledge to know what belongs in each. Above, should "website" have been "business", or could a blogging host be either (see the markup sketch after this list)? Or maybe "url"? Apparently "url" is somehow different from "website", but without digging for information, the distinction isn't obvious.
  • The categories also seem like they must include more than their names suggest. Consider the "product" category. It appears to me that the following reviewable things must be surprisingly classed as products:
    • Free software (in the free-speech sense)
    • User-generated content relating to commercial products, for instance video game levels.
    • Text files that are passed around, e.g. Linux HOWTOs.
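
To make the overlap concrete, here is the same review of a hypothetical blogging host marked up two equally defensible ways ("ExampleHost" is made up). Nothing in the format tells a reader, or a search tool, which type to look under:

    <!-- Reviewer A decides a blogging host is a website -->
    <div class="hreview">
      <span class="type">website</span>
      <span class="item"><a class="fn url" href="http://examplehost.example.com/">ExampleHost</a></span>
      <span class="rating">4.0</span>
    </div>

    <!-- Reviewer B decides it is a business -->
    <div class="hreview">
      <span class="type">business</span>
      <span class="item"><span class="fn">ExampleHost</span></span>
      <span class="rating">4.0</span>
    </div>
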
version
There are three concepts of "version" that one might want to express (see the sketch after this list). Will every implementor of hReview get it right?
  • Version of the hReview format. That's what this field is actually supposed to mean.
  • Version of the review. Yes, reviewers might want to add information or even change their minds.
  • Version of the item. Properly belongs somewhere in the "item" field, but even so, this is an error opportunity.
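
A sketch of the collision. Only the first "version" below is real hReview; "review-version" and "item-version" are invented here just to show where the other two meanings would want to live:

    <div class="hreview">
      <!-- 1. The hReview format version: what "version" actually means -->
      <span class="version">0.4</span>
      <!-- 2. Invented: a revision number for the review itself -->
      <span class="review-version">2</span>
      <span class="item">
        <span class="fn">ExampleApp</span>
        <!-- 3. Invented: the item's own version, tucked inside "item" -->
        <span class="item-version">1.3.2</span>
      </span>
    </div>
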
rating
Allowed values are from 1.0 to 5.0. Wrong! Ratings are ordinal scales. They are rankings. They are not intervals or ratios. Five 1.0's are not equal to one 5.0. The difference between 1.0 and 2.0 is not equal to the difference between 4.0 and 5.0. That is a mistake that I see people make over and over. Worse, it tempts people to do crazy things like averaging ratings.

As a user-interface issue, I understand the appeal of typing a number and being done. The problem is that the data it generates is a lie. At best, you could theoretically recover the rankings from the numbers, and then only if the user's transfer function (the way they map perceived quality onto numbers) hasn't drifted over time.

Other than that, good effort.

How I'd fix the problems

Here's how I would fix the problems:

type
There needs to be some concept of subtyping. The format could do one of the following (sketched as markup after this list):
  • Allow "type" to be a list: a path from a major type (product, etc.) down to as detailed a subtype as one wants to express.
  • Allow a "subtype" field, which would contain the list above minus its first element.
  • Recommend that subtypes go into the existing "tags" field.
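
As markup, the first two options might look something like this. Both the repeated "type" and the "subtype" class are my inventions, not part of hReview:

    <!-- Option 1: "type" holds a path from a major type down to a subtype -->
    <div class="hreview">
      <span class="type">website</span>
      <span class="type">blogging-host</span>
      <span class="item"><span class="fn">ExampleHost</span></span>
    </div>

    <!-- Option 2: a separate "subtype" field for everything after the major type -->
    <div class="hreview">
      <span class="type">website</span>
      <span class="subtype">blogging-host</span>
      <span class="item"><span class="fn">ExampleHost</span></span>
    </div>
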
version
I would just rename this field more clearly: "hreview-version".
rating
Two approaches, which IMO should both be used (sketched as markup after this list):
  • Allow comparative rankings as an alternative to numerical ratings. Comparative rankings would be indicated with respect to another item of the same type. I'd support these possibilities:
    • "better-than Item"
    • "worse-than Item"
    • "same-quality-as Item"
  • When numerical ratings are used, explicitly indicate the scale. Here are some possibilities:
    • Do so by indicating one or more anchor points. I'd tentatively recommend 1.0 and 3.0. So allow another field:
      • "set-to N.N Item", where N.N is between 1.0 and 5.0 inclusive and Item is an item. That item must agree with the "type" field.

        For instance, "set-to 1.0 item-X" would mean "the 1.0 rating is set equal to the quality of item-X; for purposes of this rating, no item inferior to it is considered a representative of this genre".

        Canonical anchor items should be found for each genre of interest so that meaningful comparisons can be made. But this may prove to be difficult. What is a reviewer to do if he feels that the canonical 3.0 item is actually worse than the canonical 1.0 item?

    • Alternatively, indicate the scale by reference to another review or body of reviews that use the same scale.
      • "same-scale-as URL"
    • Or indicate the scale by reference to a well-defined collection of reviewable items. This was actually my first thought, but it needed work. The collection needs to be well defined; otherwise Sturgeon's Law (90% of everything is crud) will stop us. For any genre, there is usually a large set of poor examples that go deservedly unknown.
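
As markup, the two approaches might look something like the following. Every class name here beyond plain "rating" ("better-than", "set-to", "same-scale-as") is invented for this sketch:

    <!-- Comparative ranking: no number, just a relation to a known item -->
    <div class="hreview">
      <span class="type">product</span>
      <span class="item"><span class="fn">ExampleApp 2.0</span></span>
      <span class="better-than">ExampleApp 1.0</span>
    </div>

    <!-- Numerical rating with its scale pinned down -->
    <div class="hreview">
      <span class="type">product</span>
      <span class="item"><span class="fn">ExampleApp 2.0</span></span>
      <span class="rating">4.0</span>
      <!-- anchor: the 1.0 end of the scale is set to a concrete item -->
      <span class="set-to" title="1.0">ExampleApp 0.9</span>
      <!-- or: borrow the scale from an existing body of reviews -->
      <a class="same-scale-as" href="http://reviews.example.com/">my earlier reviews</a>
    </div>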
