The Fragile Web

One of the many clever things that clever people can do with the Web is harvest it, aggregate it, classify it and so on. It's not just Google that does this sort of thing! Egon Willighagen is one of those clever people. He runs Chemical blogspace, which does all sorts of amazing things with blogs.

He sent me a message recently, saying that unfortunately he was not able to do any amazing things to my blog, since it was not failsafe any more. Apparently, deep down in the software he was using to harvest the details of my blog, an error along the lines of Bytes: 0xA0 0x0A 0x49 0x74 was causing grief. This is the sort of message that would make most people quake. In this instance, the excellent W3C comes to the rescue. By putting this blog feed into their RSS Validator, one can narrow down the error. It proved to be on a single line of an earlier blog posting. Remove this line, and all becomes well. In fact, if the line is displayed in a regular text editor, one eventually notices that the character at the end of the line (which looks just like a space) might be the suspect. Remove just that one character, and the RSS Validator is (almost perfectly) happy. I hope that Egon will be too now!
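For anyone faced with a similar message, here is a minimal Python sketch of how one might track down such a byte without a validator. It assumes the feed is meant to be UTF-8, in which a lone 0xA0 (the non-breaking space of Latin-1) is not a valid sequence. The function name find_invalid_utf8 and the sample bytes are purely illustrative and are not taken from the actual feed.

```python
def find_invalid_utf8(data: bytes):
    """Return (line, column, byte) for each spot where UTF-8 decoding fails.

    Line numbers are 1-based; columns are 0-based byte offsets within the line.
    """
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Try to decode everything from the current position onwards.
            data[pos:].decode("utf-8")
            break  # the remainder is clean
        except UnicodeDecodeError as err:
            bad = pos + err.start                      # absolute offset of the bad byte
            line = data.count(b"\n", 0, bad) + 1       # newlines before it give the line
            line_start = data.rfind(b"\n", 0, bad) + 1
            problems.append((line, bad - line_start, data[bad]))
            pos += err.end                             # resume after the invalid sequence
    return problems


# Illustrative sample: a stray 0xA0 at the end of a line, followed by "It".
sample = b"A line of text.\xa0\nIt continues here.\n"
for line, column, byte in find_invalid_utf8(sample):
    print(f"line {line}, byte offset {column}: 0x{byte:02X}")
```

Run against a saved copy of a feed, read in binary mode, something like this should point at the same line that the validator complains about. It only detects bytes that are not valid UTF-8; it makes no attempt to guess what the character was meant to be.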

But the lesson of this little exercise is that a single character can still bring the whole edifice crashing down (or at least my entire blog). Single characters have of course been notorious in the past. One that springs to mind was a single (white) space, inserted by accident into a line of Fortran code. That space subverted the meaning of the code, which was in fact being used to control the navigation of a spacecraft on its way to Jupiter. Result? The probe missed Jupiter by quite a margin, and the entire cost of the mission was lost (around $1 billion!).

It is also a lesson in how an individual might operate within the modern Web. During the period from 1993 to around 2001, most of the content on the Web was in the form of static HTML pages, written either by hand or with the help of software tools. This was scary stuff for most people. Then along came two social inventions: the Wiki and the Blog. Each of these hid (most of) the scary HTML from the user, and allowed (almost) pain-free creation of content. As time passed, everyone became accustomed to using such tools, and started to trust them implicitly to produce valid HTML under the hood. In my case, I trusted the Blog software (WordPress) either not to produce faulty HTML, or at least to detect it if it got in by accident. In this instance the fault is more subtle: an error in the character encoding rather than in the HTML itself. But this is the lesson. As the skills of olden times (i.e. writing native HTML) are lost, we will be more and more at the mercy of the modern tools. Will we even notice the errors, which might propagate out with our name attached? Or will the software get even smarter and fix the errors before they cause problems? Will humans become almost entirely redundant?
