Only one real difference: it now does Unicode. It reads the BOM and thus
handles UCS-2/UTF-16 (even byte-swaps); if there's no BOM, it reads and
tries to use the encoding declaration, and rejects it if it says anything
but "UTF-8" or "UTF8". It successfully parses Murata-san's translation of
the XML spec; I would love to get my hands on some more internationalized
XML, in particular with non-ASCII markup.
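
For anyone curious what that decision looks like, here is a rough sketch in
Java. It is not Lark's actual code: the class and method names (BomSniffer,
sniff, acceptableDeclaration) are made up for illustration, and the exact,
case-sensitive match on the declaration is an assumption taken from the two
spellings quoted above. The only hard facts it relies on are the BOM byte
patterns: FE FF means big-endian, FF FE means little-endian (byte-swapped).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Lark: sketches the BOM / declaration check.
final class BomSniffer {

    // Returns the 16-bit encoding implied by a leading byte-order mark,
    // or null when the input does not start with one.
    static Charset sniff(byte[] head) {
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF;   // mask to get unsigned byte values
            int b1 = head[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) {
                return StandardCharsets.UTF_16BE;   // BOM in big-endian order
            }
            if (b0 == 0xFF && b1 == 0xFE) {
                return StandardCharsets.UTF_16LE;   // byte-swapped BOM
            }
        }
        return null;   // no BOM: caller falls back to the encoding declaration
    }

    // With no BOM, only these two spellings are accepted.
    // Assumption: the comparison is exact and case-sensitive.
    static boolean acceptableDeclaration(String name) {
        return "UTF-8".equals(name) || "UTF8".equals(name);
    }
}
```

A caller would pass the first couple of bytes of the document to sniff(); on
a null result it would read the encoding declaration and give up unless
acceptableDeclaration() says yes.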
Another 6K of .class files for I18n, sigh.
Lots of bug-fixes in the event-stream module. I had to write a
significant event-stream Lark application to pull the character classes
out of the XML spec in order to build the CharClasses.java file, and
ran across a few bodacious bugs in end-tag handling.
The Unicode support is a bit bogus, because it really doesn't do UTF-8 yet,
just ASCII masquerading as such. UTF-8 Real Soon Now.
Cheers, Tim Bray
tbray@textuality.com http://www.textuality.com/ +1-604-708-9592