A semantic blog is one in which the system understands, at least in part, some of the concepts and topics in the content. The idea is that this content can be more intelligently (is that the correct word?) and, importantly, automatically searched, harvested, and connected to the same or similar concepts found in other blogs and on the Web as a whole. I am writing this post in Firefox, having added a Firefox extension called Zemanta. As I write, the system suggests similar themes elsewhere that I could choose to link to from the blog. Obviously, the more one writes, or the more specific the terms one uses, the more sensible the suggestions become; at this precise moment it is still offering fairly generic suggestions, one of which I have just chosen to add. My purpose in this particular post is to explore how the very process of writing a blog might be affected by such a product. I am also inferring (but cannot add detail at the moment) that all the semantic connections or links to other materials will be expressed in this blog using some form of formal declaration, such as RDF or RDFa.

Thus this blog has a WordPress plugin called wp-RDFa as part of its library. This gathers metadata in two forms, FOAF and Dublin Core, and expresses it using the RDFa formalism. This is really just a standard way of letting any software that might visit the blog know that this metadata is available for harvesting. FOAF is something we discussed a year or so back; it is a formal way of expressing information about yourself in RDF (see an ACS talk on the topic), and in particular indicating what you are interested in (as a chemist in my case), who you collaborate with, and where you visit (only information that you wish to make public, of course; you do not have to include any private details). Nowadays, a variety of social networking tools have become semantically enabled: this blog is one, as are a flavour of Wikis (Semantic MediaWiki, with its potential as a format for science journals), Second Life and many others. At the moment, there is little apparent added value emerging from such enrichment (I have just noted another two Zemanta articles flagged, which I will add at this instant), and certainly little in chemistry.
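
To give a feel for what "available for harvesting" might mean in practice, here is a minimal Python sketch of a visiting program pulling FOAF and Dublin Core statements out of RDFa-annotated HTML. The sample markup, the class name `RDFaPropertyHarvester` and the function `harvest_properties` are my own illustrative assumptions; I have not checked what wp-RDFa actually emits, and a real harvester would use a proper RDFa parser rather than this simplified reading of the `property` and `content` attributes.

```python
from html.parser import HTMLParser


class RDFaPropertyHarvester(HTMLParser):
    """Collect (property, value) pairs from elements carrying an RDFa `property` attribute.

    Simplification (assumption): the value is either the `content` attribute
    or the plain text directly inside the element; nested markup is ignored.
    """

    def __init__(self):
        super().__init__()
        self.found = []        # harvested (property, value) pairs, in document order
        self._current = None   # property name whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        prop = attr_map.get("property")
        if prop is None:
            self._current = None
        elif "content" in attr_map:
            # Value supplied explicitly, e.g. <meta property="..." content="...">
            self.found.append((prop, attr_map["content"]))
            self._current = None
        else:
            self._current = prop

    def handle_data(self, data):
        text = data.strip()
        if self._current and text:
            self.found.append((self._current, text))

    def handle_endtag(self, tag):
        self._current = None


def harvest_properties(html):
    """Return every (property, value) pair found in the given HTML string."""
    parser = RDFaPropertyHarvester()
    parser.feed(html)
    parser.close()
    return parser.found


# Hypothetical markup, invented for this example only.
SAMPLE_HTML = """
<div xmlns:foaf="http://xmlns.com/foaf/0.1/"
     xmlns:dc="http://purl.org/dc/elements/1.1/">
  <h2 property="dc:title">Semantic blogs</h2>
  <span property="foaf:name">Henry Rzepa</span>
  <meta property="dc:subject" content="chemistry" />
</div>
"""

if __name__ == "__main__":
    for prop, value in harvest_properties(SAMPLE_HTML):
        print(prop, "=", value)
```

The point is only that the metadata sits in the page in a predictable place, so a program does not have to guess which piece of text is the author and which is the topic.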

But what could one aspire to? For example, Steve Bachrach on his blog routinely adds InChI identifiers and keys to uniquely identify all molecules mentioned on his site. Just imagine a situation where one is describing a molecule in one’s own blog, and Zemanta (say) instantly flags up any other article out there which has tagged the same molecule. That article and your blog can now be semantically identified as talking about the same system. A harvester could collect the information about this molecule and create a superset of information about it, which in turn enriches resources such as Zemanta. (Hey, we chemists already have such a system; it is called Chemical Abstracts! But of course it's not quite the same, and I had better reserve a comparison with CAS for another post.) It's a sort of positive feedback loop!
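
The matching step imagined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post URLs and texts are invented, the function names are hypothetical, and the regular expression simply looks for the standard InChIKey layout (14 letters, 10 letters, 1 letter, separated by hyphens) without validating it. It is not how Zemanta or any existing harvester works.

```python
import re
from collections import defaultdict

# Standard InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_PATTERN = re.compile(r"\b[A-Z]{14}-[A-Z]{10}-[A-Z]\b")


def extract_inchikeys(text):
    """Return the set of InChIKey-shaped strings found in a piece of text."""
    return set(INCHIKEY_PATTERN.findall(text))


def build_molecule_index(posts):
    """Map each InChIKey to the set of post URLs that mention it.

    `posts` is a dict of {url: text}; both are hypothetical inputs here.
    """
    index = defaultdict(set)
    for url, text in posts.items():
        for key in extract_inchikeys(text):
            index[key].add(url)
    return index


def related_posts(url, posts):
    """For one post, list the other posts that tag the same molecule(s)."""
    index = build_molecule_index(posts)
    related = {}
    for key in extract_inchikeys(posts[url]):
        others = sorted(index[key] - {url})
        if others:
            related[key] = others
    return related


# Invented example posts; the InChIKeys are those of water and benzene.
POSTS = {
    "https://example.org/blog-a/water":
        "Water, InChIKey=XLYOFNOQVPJJNP-UHFFFAOYSA-N, is an unusual solvent.",
    "https://example.org/blog-b/solvents":
        "Comparing water (XLYOFNOQVPJJNP-UHFFFAOYSA-N) "
        "with benzene (UHOVQNZJYSORNB-UHFFFAOYSA-N).",
    "https://example.org/blog-c/aromatics":
        "Benzene: UHOVQNZJYSORNB-UHFFFAOYSA-N",
}

if __name__ == "__main__":
    for key, urls in related_posts("https://example.org/blog-b/solvents", POSTS).items():
        print(key, "->", urls)
```

Because the key is the same string wherever the molecule is mentioned, two posts that have never heard of each other can be connected without anyone having to agree on a name for the compound.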

Well, the Semantic Web has been a long time coming (see DOI: or 10.1087/095315101750240421, which were both written in 2001), and since it has not yet changed the Web, some tend to write it off as a lost cause. Perhaps the semantification of blogs will make a difference?

  1. OK – I’m game! What do I (and others!) need to do to get the InChIs harvested?

  2. Henry Rzepa says:

    I am in touch with the ArchivePress people, who are intent on doing this. Also, Egon Willighagen with Chemical blogspace should already be harvesting these things?

    He used to link to my blog, but apparently a rogue character got into it at one stage, and so he stopped harvesting it. I know that he harvests the Bachrach blog (since your most recent post there has now appeared). Perhaps he is also harvesting your InChIs, Steve?

