SAX (Simple API for XML)
SAX is very fast and needs little memory because it does not have to keep the whole document in memory; however, accessing the data can be somewhat awkward for many applications. It is a rather low-level interface, but it can still be useful for applications that mostly need a "linear read" and where highest performance is the priority. Many implementations are available, including one that is part of the Java API.
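To illustrate the "linear read" style, here is a minimal sketch that uses the SAX parser shipped with the JDK. The class name `SaxElementCounter`, the method `countElements`, and the sample document are made up for this illustration. The handler only reacts to start-element events and keeps a single counter, so no document tree is ever built.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class SaxElementCounter {

    /** Streams through the XML once and counts elements with the given name. */
    static int countElements(String xml, String elementName) {
        final int[] count = {0}; // mutable holder, updated from the callback

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // Called once per opening tag, in document order.
                if (qName.equals(elementName)) {
                    count[0]++;
                }
            }
        };

        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String xml = "<orders><order id=\"1\"/><order id=\"2\"/><note/></orders>";
        System.out.println(countElements(xml, "order")); // prints 2
    }
}
```

The awkwardness mentioned above shows up as soon as the application needs context, for example "the text of the title element inside the second order": the handler then has to track its own state across callbacks.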
DOM (Document Object Model)
- The whole document is kept in memory, so memory consumption is much higher than with SAX
- The document is built from generic objects such as "Document" and "Element"
- Data access is typically limited to string-typed values
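The points above can be sketched with the DOM parser included in the JDK. The class name `DomQuantityReader`, the method `totalQuantity`, and the `item`/`quantity` vocabulary are assumptions made for this example. Note that the whole document is parsed into a tree first, that the code works with the generic `Document` and `Element` types, and that the attribute value arrives as a `String` which the application has to convert itself.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class DomQuantityReader {

    /** Sums the "quantity" attribute of every "item" element. */
    static int totalQuantity(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The complete document is loaded into memory as a tree.
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            NodeList items = doc.getElementsByTagName("item");
            int total = 0;
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                // DOM hands back a String; type conversion is up to the caller.
                String raw = item.getAttribute("quantity");
                total += Integer.parseInt(raw);
            }
            return total;
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("Could not read XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<cart><item quantity=\"2\"/><item quantity=\"3\"/></cart>";
        System.out.println(totalQuantity(xml)); // prints 5
    }
}
```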
The drawback of binding frameworks, however, is that they are more complex to understand than DOM libraries, and the initial effort (creating schemas, binding definitions, code generation) is somewhat higher.
Castor has the advantage that the framework contains an O/R mapper as well as an XML binding library; if a project needs both, Castor might be the right choice. XMLBeans, on the other hand, has attracted attention in recent years, as it appears to be the most powerful library available.
Recent versions of XMLBeans support not only binding but also "low-level" XML interfaces named Cursor and Token, which are comparable to DOM libraries. Hence it is possible to work with "different perspectives" on XML data: binding on the one hand, and full XML infoset access down to whitespace and XML comments on the other. XQuery and XPath are also supported for querying XML data. The XMLBeans roadmap plans streaming XMLBeans to overcome the disadvantage that full documents have to be kept in memory.
So it appears that the future of XML processing may lie in hybrid frameworks like XMLBeans, which allow the access strategy to be changed to suit the problem at hand.