Re: CDATA by any other name... (was The raw and the cooked)

Richard L. Goerwitz III (david@megginson.com)
Fri, 30 Oct 1998 11:23:00 -0500 (EST)

Henry S. Thompson writes:

> <david@megginson.com> writes:
>
> > So, Henry's asking whether this is valid:
> >
> > <!DOCTYPE a [
> > <!ELEMENT a (b, c)>
> > <!ELEMENT b EMPTY>
> > <!ELEMENT c EMPTY>
> > ]>
> > <a><![CDATA[ ]><b/><c/></a>

And I'll answer my original posting and say that it's not valid
because it's not well-formed -- let's try

<!DOCTYPE a [
<!ELEMENT a (b, c)>
<!ELEMENT b EMPTY>
<!ELEMENT c EMPTY>
]>
<a><![CDATA[ ]]><b/><c/></a>

instead, and continue the discussion from there.
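
For anyone who wants to check the well-formedness point mechanically, here is a minimal sketch using Python's standard-library minidom (my choice of tool for illustration, not something used in this thread). The helper name is_well_formed is hypothetical. Note that minidom is a non-validating parser: it can show that the first document is not well-formed, but it says nothing about whether the second one is valid against the content model (b, c).

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# The internal DTD subset shared by both versions of the example.
DTD = """<!DOCTYPE a [
<!ELEMENT a (b, c)>
<!ELEMENT b EMPTY>
<!ELEMENT c EMPTY>
]>
"""

def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML.

    Hypothetical helper: this checks well-formedness only, because
    minidom does not validate against the DTD.
    """
    try:
        minidom.parseString(text)
    except ExpatError:
        return False
    return True

# As first posted: the CDATA section is closed with "]>" instead of "]]>",
# so it swallows the rest of the document and never terminates.
original = DTD + "<a><![CDATA[ ]><b/><c/></a>"

# As corrected: the CDATA section contains a single space.
corrected = DTD + "<a><![CDATA[ ]]><b/><c/></a>"

print(is_well_formed(original))
print(is_well_formed(corrected))
```

Whether the corrected document is also *valid* is exactly the open question, and it needs a validating parser (or a careful reading of the spec) to settle.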

> What he said. The DOM made a serious mistake here in my opinion:
> it's stranded in no-person's-land between raw and cooked, without
> being either. It's not cooked, because it gives you
> EntityReference and CDATA nodes. It's not raw, because it DOESN'T
> give you character entity references.

The DOM level-one core serves two constituencies -- authoring tools
that need to do horizontal transformations (XML=>XML, where the result
replaces the original) and processing/rendering tools that need to do
downstream processing (XML=>XML or XML=>X, where the original remains
unaltered). Horizontal transformations will usually be somewhat
lossy, and the DOM WG has clearly decided that only a few lexical
features are important enough to give a good cost/benefit return on
the effort required to specify and implement them.

However, the point is that a specific DOM tree doesn't *have* to
include nodes for comments, CDATA sections, and entity references --
they are there only to support very specialised applications and
should be stripped out for ordinary XML processing.
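
As a rough illustration of that stripping step, here is a sketch against Python's minidom (again my choice of DOM implementation, assumed only for the example). The function name cook is hypothetical. It deletes comment nodes and replaces each CDATA section node with an ordinary text node holding the same characters. As far as I know, minidom expands entity references while parsing and never builds EntityReference nodes, so there is nothing to strip for those here; other DOM implementations may differ.

```python
from xml.dom import Node, minidom

def cook(node):
    """Strip lexical detail from the subtree rooted at node, in place.

    Hypothetical helper: comments are removed, CDATA sections become
    plain text nodes, and adjacent text nodes are then merged.
    """
    _strip(node)
    node.normalize()  # merge the text nodes left next to each other

def _strip(node):
    # Iterate over a copy, since the child list is modified as we go.
    for child in list(node.childNodes):
        if child.nodeType == Node.COMMENT_NODE:
            node.removeChild(child)
        elif child.nodeType == Node.CDATA_SECTION_NODE:
            text = child.ownerDocument.createTextNode(child.data)
            node.replaceChild(text, child)
        else:
            _strip(child)

doc = minidom.parseString("<a>x<!-- note --><![CDATA[ <y> ]]>z<b/></a>")
cook(doc)
root = doc.documentElement
print(len(root.childNodes))         # one merged text node plus <b/>
print(repr(root.firstChild.data))   # the text, with no CDATA boundary left
```

After cook() runs, an application walking the tree sees only elements and text, which is the "cooked" view most downstream processing wants. Applied to the corrected example above, the CDATA section becomes a text node holding one space in front of b.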

All the best,

David

-- David Megginson david@megginson.com http://www.megginson.com/