Semantically rich molecules

Peter Murray-Rust in his blog asks for examples of the Scientific Semantic Web, a topic we have both been banging on about for ten years or more (DOI: 10.1021/ci000406v). What we are seeking, of course, is an example of how scientific connections have been made using inference logic from semantically rich statements to be found on the Web (ideally connections that might not previously have been spotted by humans, and which lie overlooked and unloved in the scientific literature). It's a tough cookie, and I look forward to the examples that Peter identifies.

Meanwhile, I thought I might share here a semantically rich molecule. OK, I identified this as such not by using the Web, but as someone who is in the process of delivering an undergraduate lecture course on the topic of conformational analysis. This course takes the form of presenting a set of rules or principles which relate to the conformations of molecules, and which themselves derive from quantum mechanics, and then illustrating them with selected annotated examples. To do this, a great many semantic connections have to be made, and in the current state of play, only a human can really hope to make most of these. For now, we look to the Semantic Web as it currently is to perhaps spot a few connections that might have been overlooked in this process.

So, below is a molecule, and I have made a few semantic connections for it (but have not actually fully formalised them in this blog; that is a different topic I might return to at some time). I feel in my bones that more connections could be made, and offer the molecule here as the fuse!

Two chair conformations of the molecule DULSAE. Note the (attractive) short H...H contacts.

To list all the likely semantics that a chemist would perceive in the graphic above would take far too long (by the time one had finished, a textbook would have been written). So here is a very short summary in the context of conformational analysis.

  1. The molecule has a six-membered ring as its backbone,
  2. which can adopt two possible chair conformations,
  3. which can interconvert by exchanging the axial and equatorial group pair for each of the four carbon atoms in the ring.
  4. An organic chemist will immediately notice a very unusual group, Fe(CO)2Cp, which itself is a semantic goldmine,
  5. but which, for the purposes here, we will regard merely as a C-Fe bond!

The (semantic) question to be posed is “which of the two conformations shown above is the more stable”? That too of course has an abundance of implicit semantics, but most human chemists will probably know that this refers to asking which of the two geometries has the lower thermodynamic free energy (and we leave aside the issue of what medium the molecule is in, i.e. solid, solution or gas). A far trickier question is “why”?

So to (some interim) answers. Well, an ωB97XD/6-311G(d) calculation (wow, think of what is implied in that concise notation) predicts conformation (a) to be more stable by 2.3 kcal/mol (2.1 in ΔG, see DOI: 10042/to-4911). Now to the why. What connections would someone well versed in conformational analysis spot?
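To put that free-energy difference into perspective, one can convert it into relative populations with a Boltzmann estimate. The sketch below assumes a simple two-state equilibrium between (a) and (b) at 298.15 K; the temperature is my assumption (none is stated above) and the function name is purely illustrative.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)

def population_of_lower(delta_g_kcal, temperature_k=298.15):
    """Fraction of the lower-energy conformer in a two-state equilibrium.

    delta_g_kcal is the (positive) free-energy gap in kcal/mol.
    The default temperature is an assumption, not a value from the calculation.
    """
    ratio = math.exp(delta_g_kcal / (R_KCAL * temperature_k))
    return ratio / (1.0 + ratio)

# 2.1 kcal/mol is the computed free-energy preference for (a) quoted above.
frac_a = population_of_lower(2.1)
print(f"estimated population of (a): {frac_a:.1%}")
```

On these assumptions, (a) would account for roughly 97% of the population, so a couple of kcal/mol is already quite a decisive preference.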

  1. The molecule has two methyl groups on adjacent atoms. They may prefer to be di-axial rather than di-equatorial to avoid excessive steric repulsions (whatever we mean by that!). That might favour (b).
  2. The molecule has one carbon with both a cyano and an ether linkage. Well, that is susceptible to an anomeric effect (although, as I pointed out in an earlier post here, this connection has in fact often NOT been made in the literature). Only in conformation (a) is one of the oxygen lone pairs aligned anti-periplanar to the axis of the C-CN bond. The reasons why this is important are outlined in my lecture course.
  3. Having spotted the last, the human might ask whether there is any possibility of an anomeric effect between an oxygen lone pair and the axis of the C-Fe bond? Well, I rather think that not a single human has ever asked that question! (I cannot know that of course, and perhaps someone has speculated upon this in the literature; this is where a full semantic web would help. That question could be posed of it! The reason I suspect the connection might not have been made is that the anomeric effect is the domain of the organic chemist, and C-Fe bonds are that of the organometallic chemist. They do tend to see the chemical world rather differently, these two groups of chemists.) If there were such an effect, it would favour (a).
  4. Then we have an X-C-C-Y motif. Depending on the nature of X and Y, the molecule might actually prefer a gauche conformation, i.e. the dihedral angle XCCY would be around 60°. There are several such motifs one can detect; X=Y=O (twice). It might be that other permutations, such as X=CN, Y=Fe(CO)2Cp, favour anti-periplanar. There are other permutations whose orientational preference may not even be recorded (in textbooks). Suddenly it's gotten complicated!
  5. There are a number of short (~2.4Å) H…H contacts.
  6. We are starting to understand that to unravel the conformation of this molecule, one may have to identify quite a number of different “rules”, and then to quantify each, and add up the numbers to get the final result. That energy of 2.3 kcal/mol may be composed of the result of applying quite a number of different rules. Hence the title of this post, a semantically rich molecule!
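Several of the rules above come down to measuring a dihedral angle and deciding whether it is gauche (around 60°) or anti-periplanar (around 180°). As a minimal sketch of how that test might be automated, the following computes the X-C-C-Y dihedral from four Cartesian positions. The coordinates are idealised values I made up for illustration (they are not taken from the DULSAE structure), and the ±30° tolerance is likewise an arbitrary assumption.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p1, p2, p3, p4):
    """Signed dihedral angle (degrees) for atoms p1-p2-p3-p4."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals to the two planes
    b2_len = math.sqrt(_dot(b2, b2))
    m1 = _cross(n1, tuple(c / b2_len for c in b2))
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))

def classify(angle_deg, tolerance=30.0):
    """Label a dihedral; the tolerance is an illustrative assumption."""
    a = abs(angle_deg)
    if abs(a - 60.0) <= tolerance:
        return "gauche"
    if abs(a - 180.0) <= tolerance:
        return "anti-periplanar"
    return "other"

# Idealised, made-up coordinates for X, C, C, Y (not from the crystal structure).
gauche_xccy = [(1.0, 0.0, -0.5), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
               (0.5, math.sqrt(3) / 2, 1.5)]
anti_xccy = [(1.0, 0.0, -0.5), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
             (-1.0, 0.0, 1.5)]

for name, atoms in (("gauche example", gauche_xccy), ("anti example", anti_xccy)):
    angle = dihedral_deg(*atoms)
    print(f"{name}: |dihedral| = {abs(angle):.1f} deg, {classify(angle)}")
```

The sign of the angle depends on the convention chosen, which is why the classification uses only its magnitude.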

Well, I will leave it here for this post, without giving answers to the six points listed above, or really answering my main question posed above. That would make the post too complex (but I will follow this up!). I do want to end by planting the idea that answering this question involves making a great many chemical connections about the properties of this molecule, and then identifying quantitative ways (algorithms) in which an answer can be formulated. The molecule above is presented as a challenge for the Semantic Web to address!
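To give a flavour of what posing such a question to a machine might look like, here is a toy sketch in which a couple of the statements made above are written as subject-predicate-object triples and a single rule flags carbons where an anomeric interaction might be worth considering. Everything here is an assumption for illustration: the atom labels are hypothetical, the triples are my informal paraphrase of points 2 and 3 rather than a formal ontology, and the rule is deliberately naive.

```python
# Hypothetical atom labels; facts paraphrase points 2 and 3 of the list above.
TRIPLES = {
    ("C_cn", "bonded_to", "O_ether"),
    ("C_cn", "bears", "CN"),
    ("C_fe", "bonded_to", "O_ether"),  # assumption implied by the question in point 3
    ("C_fe", "bears", "Fe(CO)2Cp"),
}

LONE_PAIR_DONORS = {"O_ether"}

# Groups whose C-X bond is treated as a possible acceptor, with a status note.
POSSIBLE_ACCEPTORS = {
    "CN": "anomeric effect expected (point 2)",
    "Fe(CO)2Cp": "speculative (point 3)",
}

def anomeric_candidates(triples):
    """Carbons attached to a lone-pair donor that also bear a possible acceptor."""
    donor_carbons = {s for s, p, o in triples
                     if p == "bonded_to" and o in LONE_PAIR_DONORS}
    return sorted((s, o) for s, p, o in triples
                  if p == "bears" and s in donor_carbons and o in POSSIBLE_ACCEPTORS)

candidates = anomeric_candidates(TRIPLES)
for carbon, group in candidates:
    print(f"{carbon}: O lone pair / C-{group} bond, {POSSIBLE_ACCEPTORS[group]}")
```

A real Semantic Web treatment would of course draw such statements from published, machine-readable sources rather than from a hand-typed set, which is precisely the challenge.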


2 Responses to “Semantically rich molecules”

  1. Andrew White says:

    It is curious how different chemists view the same molecule. Looking at DULSAE I instantly focused on the metal, and rather than seeing the Fe(CO)2Cp group as just another substituent on the ring my first instinct was to view the ring as a ligand bound to the metal. (Coming from a coordination chemistry background this is not especially surprising.) The second instinct was “that is a sterically big Fe unit, it must want to go equatorial”, so I was interested to see that the axial form was the energy minimum and that the desire for the large group to go equatorial got no mention in your quick run down of some of the competing factors.

    • Henry Rzepa says:

      The “large group equatorial” rule is one of the more venerable rules, dating back to Barton himself I suspect (or earlier). As many rules of thumb tend to do after a long time, it has assumed a sort of infallible status. If you check the 3D X-ray structure in the post, you will find a close H…H interaction between the Cp ring and the main ring, with the large group in the axial position. One presumes this might be (mildly) attractive. I have not (yet) checked whether this interaction, or a better one, can also be achieved with the Fe group equatorial.
