org.cyberneko.html.filters
public class Purifier extends DefaultFilter
Illegal characters in XML names are converted to the character sequence "_u####_", where "####" is the hexadecimal value of the Unicode character. Illegal characters appearing in document content are converted to the character sequence "\\u####".
In comments, the character '-' is replaced by the character sequence "- " to prevent "--" from ever appearing in the comment content. For CDATA sections, the character ']' is replaced by the character sequence "] " to prevent "]]" from appearing.
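The name, comment, and CDATA rules above can be illustrated with a small standalone sketch. It does not use or reproduce the NekoHTML source: the class `PurifierRulesSketch` and its `escape*` methods are hypothetical, the legality check is a simplified ASCII-only stand-in for the real XML name rules, and the body of `toHexString` (including the lowercase hex digits it produces) is an assumption that only mirrors the signature listed in the method summary.

```java
public class PurifierRulesSketch {

    /** Pads the hexadecimal form of c with leading zeros up to padlen digits (assumed behavior). */
    static String toHexString(int c, int padlen) {
        StringBuilder sb = new StringBuilder(Integer.toHexString(c));
        while (sb.length() < padlen) {
            sb.insert(0, '0');
        }
        return sb.toString();
    }

    /** Simplified ASCII-only check (assumption): not the full XML Name production. */
    static boolean isLegalNameChar(char ch, boolean first) {
        boolean letter = (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
        if (first) {
            return letter || ch == '_';
        }
        return letter || (ch >= '0' && ch <= '9') || ch == '_' || ch == '-' || ch == '.';
    }

    /** Replaces each illegal name character with "_u####_". */
    static String escapeName(String name) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char ch = name.charAt(i);
            if (isLegalNameChar(ch, i == 0)) {
                out.append(ch);
            } else {
                out.append("_u").append(toHexString(ch, 4)).append('_');
            }
        }
        return out.toString();
    }

    /** Replaces '-' with "- " so that "--" cannot appear in comment content. */
    static String escapeComment(String text) {
        return text.replace("-", "- ");
    }

    /** Replaces ']' with "] " so that "]]" cannot appear in a CDATA section. */
    static String escapeCData(String text) {
        return text.replace("]", "] ");
    }

    public static void main(String[] args) {
        System.out.println(escapeName("my name"));  // prints my_u0020_name
        System.out.println(escapeComment("a--b"));  // prints a- - b
        System.out.println(escapeCData("x]]>"));    // prints x] ] >
    }
}
```

Because every '-' or ']' is followed by a space, two of them can never be adjacent in the output, which is the property the description relies on.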
The URI used for synthesized namespace bindings is "http://cyberneko.org/html/ns/synthesized/number" where number is generated to ensure uniqueness.
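One way to produce such unique URIs is a per-instance counter, which the `fSynthesizedNamespaceCount` field suggests. The sketch below is an assumption about the approach, not the library's code: the class `SynthesizedNamespaceSketch`, the constant `URI_BASE`, the method `nextUri`, and the choice to start numbering at 1 are all hypothetical.

```java
public class SynthesizedNamespaceSketch {

    /** Base of the synthesized namespace URI, taken from the class description. */
    static final String URI_BASE = "http://cyberneko.org/html/ns/synthesized/";

    /** Number of bindings synthesized so far (hypothetical counterpart of fSynthesizedNamespaceCount). */
    private int count = 0;

    /** Returns a URI that is unique for this instance by appending an increasing counter. */
    String nextUri() {
        count++;
        return URI_BASE + count;
    }

    public static void main(String[] args) {
        SynthesizedNamespaceSketch sketch = new SynthesizedNamespaceSketch();
        System.out.println(sketch.nextUri());
        System.out.println(sketch.nextUri());
    }
}
```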
Version: $Id: Purifier.java,v 1.5 2005/02/14 03:56:54 andyc Exp $
| Field Summary | |
|---|---|
| protected static String | AUGMENTATIONS Include infoset augmentations. |
| protected boolean | fAugmentations Augmentations. |
| protected boolean | fInCDATASection True if inside a CDATA section. |
| protected NamespaceContext | fNamespaceContext Namespace information. |
| protected boolean | fNamespaces Namespaces. |
| protected String | fPublicId Public identifier of doctype declaration. |
| protected boolean | fSeenDoctype True if the doctype declaration was seen. |
| protected boolean | fSeenRootElement True if root element was seen. |
| protected int | fSynthesizedNamespaceCount Synthesized namespace binding count. |
| protected String | fSystemId System identifier of doctype declaration. |
| protected static String | NAMESPACES Namespaces. |
| protected static HTMLEventInfo | SYNTHESIZED_ITEM Synthesized event info item. |
| static String | SYNTHESIZED_NAMESPACE_PREFX Synthesized namespace binding prefix. |

| Method Summary | |
|---|---|
| void | characters(XMLString text, Augmentations augs) Characters. |
| void | comment(XMLString text, Augmentations augs) Comment. |
| void | doctypeDecl(String root, String pubid, String sysid, Augmentations augs) Doctype declaration. |
| void | emptyElement(QName element, XMLAttributes attrs, Augmentations augs) Empty element. |
| void | endCDATA(Augmentations augs) End CDATA section. |
| void | endElement(QName element, Augmentations augs) End element. |
| protected void | handleStartDocument() Handle start document. |
| protected void | handleStartElement(QName element, XMLAttributes attrs) Handle start element. |
| void | processingInstruction(String target, XMLString data, Augmentations augs) Processing instruction. |
| protected String | purifyName(String name, boolean localpart) Purify name. |
| protected QName | purifyQName(QName qname) Purify qualified name. |
| protected XMLString | purifyText(XMLString text) Purify content. |
| void | reset(XMLComponentManager manager) |
| void | startCDATA(Augmentations augs) Start CDATA section. |
| void | startDocument(XMLLocator locator, String encoding, Augmentations augs) Start document. |
| void | startDocument(XMLLocator locator, String encoding, NamespaceContext nscontext, Augmentations augs) Start document. |
| void | startElement(QName element, XMLAttributes attrs, Augmentations augs) Start element. |
| protected void | synthesizeBinding(XMLAttributes attrs, String ns) Synthesize namespace binding. |
| protected Augmentations | synthesizedAugs() Returns an augmentations object with a synthesized item added. |
| protected static String | toHexString(int c, int padlen) Returns a padded hexadecimal string for the given value. |
| void | xmlDecl(String version, String encoding, String standalone, Augmentations augs) XML declaration. |