Open Graph, or the website extended

The Open Graph protocol recently rolled out by Facebook introduces a set of metadata properties for web pages. When I mentioned Alex Iskold’s assessment of its implications, John Warren quickly pointed out some significant limitations of Open Graph as a metadata format:

  • The page has a single subject.
  • The properties of that subject are a flat list without deep structure or relations.
  • The subject identifier is not controlled.
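
To make those limitations concrete, here is a minimal sketch of what a consumer actually gets from an Open Graph page. The `og:title`, `og:type`, and `og:url` property names come from the protocol; the sample markup, the example.com URL, and the helper names are invented for illustration, and this is not a description of how Facebook reads pages.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..."> tags into one flat dictionary."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        if prop.startswith("og:") and content is not None:
            # One page, one subject: every property describes the same thing.
            self.properties[prop[len("og:"):]] = content


# Hypothetical page; the URL is a placeholder, not a real movie page.
SAMPLE_PAGE = """
<html><head>
<meta property="og:title" content="Goodbye Solo (2008)" />
<meta property="og:type" content="movie" />
<meta property="og:url" content="http://movies.example.com/goodbye-solo" />
</head><body></body></html>
"""


def extract_open_graph(html):
    """Return the page's Open Graph properties as a flat dict."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


if __name__ == "__main__":
    print(extract_open_graph(SAMPLE_PAGE))
```

The result is a single flat dictionary per page, with no nesting and no links to other subjects, and its identifier is whatever URL the publisher chose to declare.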

Recognizing those limitations (thanks) makes it easier to see what’s going on. Open Graph does not take on the challenge of classifying resources for discovery or of publishing datasets. Instead, Open Graph provides a vehicle for delegating maintenance of resource descriptions to resource experts. For Facebook, the experts include Pandora for music, IMDb for movies, Yelp for reviews, and so on. Facebook just has to capture the expert definition of a resource when a user clicks on the Like widget.

Both the technical and social engineering here are clever. Because maintenance is distributed, the approach can scale. That’s reminiscent of the Semantic Web Environment Directory (SWED) and of Google, for that matter (because maintenance of the link data that’s crucial for the Google index is distributed to the authors of web pages). Also, besides making the initial implementation a lot easier, the narrow scope makes for better graph nodes. Allowing metadata about only a single subject encourages the creation of pages with focus, one of the tenets of Information Architecture.

From a Semantic Web perspective, a couple of points stand out. First, as mentioned previously, Open Graph doesn’t address the typical Semantic Web scenario of publishing public datasets but, instead, augments a private graph with isolated public resource definitions. Second, instead of assigning a unique identifier to a real-world object, Open Graph appears to identify a real-world object indirectly through its association with a web page. That’s pragmatic for web publishers but challenging for processing. Consider the movie Goodbye Solo, which has web pages on IMDb and Rotten Tomatoes. I’d expect each site to treat its page as the canonical definition of the movie instead of accepting the other site’s claim to primacy. To provide a unifying social object (as explained by Dare Obasanjo) for qualifying the relationships among all of its fans, the movie needs a single object in the graph. Even though the object type is controlled and will be movie for both IMDb and Rotten Tomatoes, the title can and, in this case, does vary: “Goodbye Solo (2008)” on IMDb and “Goodbye Solo (2009)” on Rotten Tomatoes. Perhaps Facebook establishes equivalence through a combination of textual matching and common fans of both web pages or, perhaps, connecting a subset of the real fans via one web page is good enough.
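
Purely as speculation, the "textual matching plus common fans" idea could look something like the sketch below. Everything in it is an assumption made for illustration: the function names, the fan sets, and the 0.2 overlap threshold are invented, and nothing here reflects what Facebook actually does.

```python
import re


def normalize_title(title):
    """Lowercase, drop a parenthesized year, strip punctuation, collapse spaces."""
    without_year = re.sub(r"\(\d{4}\)", " ", title)
    plain = re.sub(r"[^a-z0-9 ]", " ", without_year.lower())
    return " ".join(plain.split())


def fan_overlap(fans_a, fans_b):
    """Jaccard similarity of two fan sets (0.0 when either set is empty)."""
    if not fans_a or not fans_b:
        return 0.0
    return len(fans_a & fans_b) / len(fans_a | fans_b)


def probably_same_object(page_a, page_b, min_overlap=0.2):
    """Guess whether two pages describe the same real-world object.

    min_overlap is an arbitrary, hypothetical threshold.
    """
    same_type = page_a["type"] == page_b["type"]
    same_title = normalize_title(page_a["title"]) == normalize_title(page_b["title"])
    enough_fans = fan_overlap(page_a["fans"], page_b["fans"]) >= min_overlap
    return same_type and same_title and enough_fans


# Hypothetical data: the titles are the ones quoted above, the fans are made up.
imdb_page = {"type": "movie", "title": "Goodbye Solo (2008)",
             "fans": {"ann", "bob", "cy"}}
rt_page = {"type": "movie", "title": "Goodbye Solo (2009)",
           "fans": {"bob", "cy", "dee"}}

if __name__ == "__main__":
    print(probably_same_object(imdb_page, rt_page))
```

Dropping the year is what lets the two titles match here, but it would also merge a remake with its original, which is exactly the kind of judgment call that an uncontrolled subject identifier pushes onto whoever integrates the pages.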

While Open Graph is an open standard, the Like widget that harvests the data and the private graph that integrates the harvested data are the real keys to taking advantage of the Open Graph pages on IMDb and company. Other solutions can still crawl the Open Graph pages, but such solutions will have to come up with their own integration logic and will have to tolerate days or weeks of lag in updates.

Regardless, Open Graph shows a disruptive innovation: a unified web experience through dynamic integration of pages hosted by multiple organizations. No small feat.
