gnu.xml.pipeline
public final class ValidationConsumer extends EventFilter
xmlns*
attributes (rather than omitting either or both). At this writing, the major SAX2 parsers (such as Ælfred2, Crimson, and Xerces) meet these requirements, and this validation module is used by the optional Ælfred2 validation support.
Note that because this is a layered validator, it has to duplicate some work that the parser is doing; there are also other costs to layering. However, because of layering it doesn't need a parser in order to work! You can use it with anything that generates SAX events, such as an application component that wants to detect invalid content in a changed area without validating an entire document, or which wants to ensure that it doesn't write invalid data to a communications partner.
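To illustrate the point about not needing a parser, here is a minimal sketch in which application code produces the SAX events itself. It uses only the standard org.xml.sax classes. `NestingChecker` and `emitFragment` are hypothetical names invented for this example; the checker is a trivial stand-in for a layered consumer and is not the ValidationConsumer implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class HandDrivenEvents {
    /** Hypothetical stand-in for a layered checker: it only verifies start/end nesting. */
    static class NestingChecker extends DefaultHandler {
        private final Deque<String> open = new ArrayDeque<>();
        int elementsSeen = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            open.push(qName);
            elementsSeen++;
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (open.isEmpty() || !open.pop().equals(qName)) {
                throw new SAXException("unmatched end of element: " + qName);
            }
        }
    }

    /** Application code acting as the event producer; no parser is involved. */
    static void emitFragment(ContentHandler out) throws SAXException {
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute("", "id", "id", "ID", "n1");
        out.startDocument();
        out.startElement("", "note", "note", atts);
        char[] text = "changed text".toCharArray();
        out.characters(text, 0, text.length);
        out.endElement("", "note", "note");
        out.endDocument();
    }

    public static void main(String[] args) throws SAXException {
        NestingChecker checker = new NestingChecker();
        emitFragment(checker);
        System.out.println("elements seen: " + checker.elementsSeen);
    }
}
```

Any consumer that implements the SAX handler interfaces could presumably be substituted for the stand-in checker, since the producer only depends on `ContentHandler`.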
Also, note that because this is a layered validator, the line numbers reported for some errors may seem strange. For example, if an element does not permit character content, the validator will use the locator provided to it. That might reflect the last character of a characters event callback, rather than the first non-whitespace character.
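The locator behavior described above can be observed with a small sketch that relies only on the JDK's bundled SAX parser. `LineRecorder` and `reportedLine` are hypothetical names for this example. The handler can only ask the locator where the parser currently is when the characters callback arrives; it cannot recover where the text began, and the exact line reported may depend on how the parser chunks character data.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

public class LocatorDemo {
    /** Records the locator's line number when the first non-whitespace text arrives. */
    static class LineRecorder extends DefaultHandler {
        private Locator locator;
        int lineAtFirstText = -1;

        @Override
        public void setDocumentLocator(Locator locator) {
            this.locator = locator;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            boolean hasText = new String(ch, start, length).trim().length() > 0;
            if (lineAtFirstText < 0 && locator != null && hasText) {
                // This is the parser's position at callback time,
                // not necessarily the line where the text started.
                lineAtFirstText = locator.getLineNumber();
            }
        }
    }

    static int reportedLine(String xml) throws Exception {
        LineRecorder handler = new LineRecorder();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.lineAtFirstText;
    }

    public static void main(String[] args) throws Exception {
        // The text starts on line 1, but the event may end on a later line.
        String xml = "<doc>text\n\n\n</doc>";
        System.out.println("line reported for the text event: " + reportedLine(xml));
    }
}
```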
Current limitations of the validation performed are in roughly three categories.
The first category represents constraints which demand violations of software layering: exposing lexical details, one of the first things that application programming interfaces (APIs) hide. These invariably relate to XML entity handling, and to historical oddities of the XML validation semantics. Curiously, recent (Autumn 1999) conformance testing showed that these constraints are among those handled worst by existing XML validating parsers. Arguments have been made that each of these VCs should be turned into WFCs (most of them) or discarded (popular for the standalone declaration); in short, that these are bugs in the XML specification (not all via SGML):
The second category of limitations on this validation represents constraints associated with information that is not guaranteed to be available (or, in one case, is guaranteed not to be available) through the SAX2 API:
A third category relates to ease of implementation. (Think of this as "bugs".) The most notable issue here is character handling. Rather than attempting to implement the voluminous character tables in the XML specification (Appendix B), Unicode rules are used directly from the java.lang.Character class. Recent JVMs have begun to diverge from the original specification for that class (Unicode 2.0), meaning that different JVMs may handle that aspect of conformance differently.
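As a rough illustration of what relying on java.lang.Character looks like, the sketch below approximates an XML name check. It is an assumption-laden simplification, not the code used by this class: `isNameStart`, `isNameChar`, and `isName` are hypothetical helpers, and the combining characters and extenders listed in Appendix B are ignored.

```java
public class NameChars {
    /** Approximates the XML name-start rule with java.lang.Character. */
    static boolean isNameStart(char c) {
        return Character.isLetter(c) || c == '_' || c == ':';
    }

    /** Approximates the XML name-character rule; ignores combining chars and extenders. */
    static boolean isNameChar(char c) {
        return Character.isLetterOrDigit(c) || c == '.' || c == '-' || c == '_' || c == ':';
    }

    static boolean isName(String s) {
        if (s == null || s.isEmpty() || !isNameStart(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!isNameChar(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isName("xml-name")); // prints "true"
        System.out.println(isName("1abc"));     // prints "false"
    }
}
```

Because `Character.isLetter` and `Character.isLetterOrDigit` follow whatever Unicode version the running JVM implements, results for characters outside the common ranges may differ between JVMs, which is the divergence the paragraph above warns about.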
Note that for some of the validity errors that SAX2 does not expose, a nonvalidating parser is permitted (by the XML specification) to report validity errors. When used with a parser that does so for the validity constraints mentioned above (or any other SAX2 event stream producer that does the same thing), overall conformance is substantially improved.
Version: $Date: 2001/11/09 22:53:17 $
See Also: SAXDriver
Constructor Summary | |
---|---|
ValidationConsumer() | Creates a pipeline terminus which consumes all events passed to it; this will report validity errors as if they were fatal errors, unless an error handler is assigned. |
ValidationConsumer(EventConsumer next) | Creates a pipeline filter which reports validity errors and then passes events on to the next consumer if they were not fatal. |
ValidationConsumer(String rootName, String publicId, String systemId, String internalSubset, EntityResolver resolver, String minimalDocument) | Creates a validation consumer which is preloaded with the DTD provided. |
Method Summary | | |
---|---|---|
void | attributeDecl(String eName, String aName, String type, String mode, String value) | DeclHandler: Records the attribute declaration for later use in validating document content, and checks validity constraints that are applicable to attribute declarations. |
void | characters(char[] ch, int start, int length) | ContentHandler: Reports a validity error if the element's content model does not permit character data. |
void | elementDecl(String name, String model) | DeclHandler: Records the element declaration for later use when checking document content, and checks validity constraints that apply to element declarations. |
void | endDocument() | ContentHandler: Checks whether all ID values that were referenced have been declared, and releases all resources. |
void | endDTD() | LexicalHandler: Verifies that all referenced notations and unparsed entities have been declared. |
void | endElement(String uri, String localName, String qName) | ContentHandler: Reports a validity error if the element's content model does not permit end-of-element yet, or a well-formedness error if there was no matching startElement call. |
void | externalEntityDecl(String name, String publicId, String systemId) | DeclHandler: Passed to the next consumer, unless this one was preloaded with a particular DTD. |
void | internalEntityDecl(String name, String value) | DeclHandler: Passed to the next consumer, unless this one was preloaded with a particular DTD. |
void | notationDecl(String name, String publicId, String systemId) | DTDHandler: Records the notation name, for checking NOTATION attribute values and declarations of unparsed entities. |
void | skippedEntity(String name) | ContentHandler: Reports a fatal exception. |
void | startDocument() | ContentHandler: Ensures that state from any previous parse has been deleted. |
void | startDTD(String name, String publicId, String systemId) | LexicalHandler: Records the declaration of the root element, so it can be verified later. |
void | startElement(String uri, String localName, String qName, Attributes atts) | ContentHandler: Performs validity checks against element (and document) content models, and attribute values. |
void | unparsedEntityDecl(String name, String publicId, String systemId, String notationName) | DTDHandler: Records the entity name, for checking ENTITY and ENTITIES attribute values; records the notation name if it hasn't yet been declared. |
See Also: ValidationConsumer
The resulting validation consumer will only validate against the specified DTD, regardless of whether some other DTD is found in a document being parsed.
Parameters: rootName - The name of the required root element; if this is null, any root element name will be accepted. publicId - If non-null and there is a non-null systemId, this identifier provides an alternate access identifier for the DTD's external subset. systemId - If non-null, this is a URI (normally a URL) that may be used to access the DTD's external subset. internalSubset - If non-null, holds literal markup declarations comprising the DTD's internal subset. resolver - If non-null, this will be provided to the parser for use when resolving parameter entities (including any external subset). minimalDocument - If non-null, a minimal valid document.
Throws: SAXNotSupportedException - If the default SAX parser does not support the standard lexical or declaration handlers. SAXParseException - If the specified DTD has either well-formedness or validity errors. IOException - If the specified DTD can't be read for some reason.
Source code is under GPL (with library exception) in the JAXP project at http://www.gnu.org/software/classpathx/jaxp
This documentation was derived from that source code on 2011-08-26.