org.compass.core.xml.jdom.converter.support
Class StAXTextModifier

java.lang.Object
  extended by org.compass.core.xml.jdom.converter.support.StAXTextModifier
Direct Known Subclasses:
StAXBuilder.IndentRemover

public abstract class StAXTextModifier
extends Object

Strategy class (used by StAXBuilder) that allows modifying text content when building a JDOM tree from an XML document using a StAX XMLStreamReader. It is most commonly used to trim white space that the parser cannot automatically identify as ignorable (usually because the document has no associated DTD), but it can be used for other manipulations as well.

Basic calling sequence is as follows:

  1. For each START_ELEMENT and END_ELEMENT event, allowModificationsAfter(javax.xml.stream.XMLStreamReader, int) is called to determine whether CHARACTERS events read after this event may be modified. This lets the builder skip calling the other methods on this object for elements whose text content should not be modified (such as the <pre> element in (X)HTML).
  2. For each CHARACTERS event that follows a call to allowModificationsAfter(javax.xml.stream.XMLStreamReader, int) that returned true, possiblyModifyText(javax.xml.stream.XMLStreamReader, int) is called to determine whether the contents of that event should be modified before being added to the JDOM tree.
  3. Finally, for each CHARACTERS event for which the call to possiblyModifyText(javax.xml.stream.XMLStreamReader, int) returned true, textToIncludeBetween(javax.xml.stream.XMLStreamReader, int, int, java.lang.String) is called to determine the resulting text to add to the JDOM tree. This may be the original text (which is passed as an argument) or something else, including null or an empty String to effectively remove that text event from the tree.
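
The calling sequence above can be sketched as a small driver loop. This is only an illustration under stated assumptions, not the actual StAXBuilder code: TextModifierSketch is a local stand-in that mirrors the three method signatures so the example compiles without Compass, CallingSequenceSketch, collectText and demo are hypothetical names, and surviving text is collected into a list instead of a JDOM tree.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Local stand-in (assumption) with the same three method signatures as StAXTextModifier.
abstract class TextModifierSketch {
    abstract boolean allowModificationsAfter(XMLStreamReader r, int eventType) throws XMLStreamException;
    abstract boolean possiblyModifyText(XMLStreamReader r, int prevEvent) throws XMLStreamException;
    abstract String textToIncludeBetween(XMLStreamReader r, int prevEvent, int nextEvent, String text)
            throws XMLStreamException;
}

class CallingSequenceSketch {
    // Hypothetical driver: returns the text segments that survive the modifier.
    static List<String> collectText(String xml, TextModifierSketch mod) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE); // one event per text run
        XMLStreamReader r = factory.createXMLStreamReader(new StringReader(xml));
        List<String> kept = new ArrayList<>();
        boolean allow = false;
        int prev = r.getEventType(); // START_DOCUMENT before the first next()
        int event = r.next();
        while (event != XMLStreamConstants.END_DOCUMENT) {
            if (event == XMLStreamConstants.CHARACTERS || event == XMLStreamConstants.CDATA) {
                String text = r.getText();
                // Step 2: only asked when step 1 last answered true.
                boolean maybe = allow && mod.possiblyModifyText(r, prev);
                int next = r.next(); // the reader now points past the text
                if (maybe) {
                    // Step 3: the text is passed explicitly, the reader is already on 'next'.
                    text = mod.textToIncludeBetween(r, prev, next, text);
                }
                if (text != null && !text.isEmpty()) {
                    kept.add(text);
                }
                prev = event;
                event = next;
                continue;
            }
            if (event == XMLStreamConstants.START_ELEMENT || event == XMLStreamConstants.END_ELEMENT) {
                // Step 1: asked at every element boundary.
                allow = mod.allowModificationsAfter(r, event);
            }
            prev = event;
            event = r.next();
        }
        r.close();
        return kept;
    }

    // Runs the driver with a modifier that drops whitespace-only text.
    static List<String> demo() {
        TextModifierSketch dropBlank = new TextModifierSketch() {
            boolean allowModificationsAfter(XMLStreamReader r, int eventType) { return true; }
            boolean possiblyModifyText(XMLStreamReader r, int prevEvent) { return true; }
            String textToIncludeBetween(XMLStreamReader r, int prevEvent, int nextEvent, String text) {
                return text.trim().isEmpty() ? null : text;
            }
        };
        try {
            return collectText("<a>\n  <b>hi</b>\n</a>", dropBlank);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [hi]
    }
}
```

The driver reads the text before advancing the reader, which is why the text has to be handed to textToIncludeBetween as an argument.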

The default implementation of this class implements simple logic that removes all "indentation" white space from the document. This is done by always enabling modifications in the whole tree, and removing those text events that consist entirely of white space and start with a line break character (\r or \n). Extending classes can create much more fine-grained heuristics.
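
As a rough illustration of the indentation-removal behavior described above, the following sketch shows one way the three methods could cooperate. It is not the source of StAXBuilder.IndentRemover: the class name IndentRemoverSketch and the helper isIndentation are hypothetical, the split of work between possiblyModifyText and textToIncludeBetween is an assumption, and a real implementation would extend StAXTextModifier rather than stand alone.

```java
import javax.xml.stream.XMLStreamReader;

// Stand-alone sketch (assumption); a real modifier would extend StAXTextModifier.
class IndentRemoverSketch {

    // True if the text is all white space and starts with \r or \n.
    static boolean isIndentation(String text) {
        if (text == null || text.isEmpty()) {
            return false;
        }
        char first = text.charAt(0);
        if (first != '\r' && first != '\n') {
            return false;
        }
        for (int i = 1; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c != ' ' && c != '\t' && c != '\r' && c != '\n') {
                return false;
            }
        }
        return true;
    }

    // Modifications are enabled everywhere in the tree.
    public boolean allowModificationsAfter(XMLStreamReader r, int eventType) {
        return true;
    }

    // Only all-whitespace text events are candidates for removal.
    public boolean possiblyModifyText(XMLStreamReader r, int prevEvent) {
        return r.isWhiteSpace();
    }

    // Returning null removes the text event; anything else is kept unchanged.
    public String textToIncludeBetween(XMLStreamReader r, int prevEvent, int nextEvent, String text) {
        return isIndentation(text) ? null : text;
    }
}
```

White space that does not begin with a line break, such as a single space between two inline elements, is kept by this sketch, which matches the rule stated above.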

Author:
kimchy

Constructor Summary
protected StAXTextModifier()
           
 
Method Summary
abstract  boolean allowModificationsAfter(javax.xml.stream.XMLStreamReader r, int eventType)
          Method called to determine whether to possibly remove (indentation) white space after START_ELEMENT or END_ELEMENT that the stream reader currently points to.
abstract  boolean possiblyModifyText(javax.xml.stream.XMLStreamReader r, int prevEvent)
          Method called for CHARACTERS and CDATA events when the previous call to allowModificationsAfter(javax.xml.stream.XMLStreamReader, int) returned true.
abstract  String textToIncludeBetween(javax.xml.stream.XMLStreamReader r, int prevEvent, int nextEvent, String text)
          Method called to determine what to include in place of the preceding text segment (of type CHARACTERS or CDATA), given event types that precede and follow the text segment.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

StAXTextModifier

protected StAXTextModifier()
Method Detail

allowModificationsAfter

public abstract boolean allowModificationsAfter(javax.xml.stream.XMLStreamReader r,
                                                int eventType)
                                         throws javax.xml.stream.XMLStreamException
Method called to determine whether to possibly remove (indentation) white space after START_ELEMENT or END_ELEMENT that the stream reader currently points to.

Parameters:
r - Stream reader that currently points to the event in question.
eventType - Type of the event currently pointed to (either START_ELEMENT or END_ELEMENT)
Throws:
javax.xml.stream.XMLStreamException
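
One possible use of this method is to switch modifications off inside elements whose text must be preserved, such as <pre>. The sketch below is an assumption about how a subclass might do that, not code from Compass: PreAwarePolicySketch and demo are hypothetical names, and the class stands alone so that it compiles with the JDK only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Stand-alone sketch (assumption); a real modifier would extend StAXTextModifier.
class PreAwarePolicySketch {
    private int preDepth = 0; // how many <pre> elements are currently open

    public boolean allowModificationsAfter(XMLStreamReader r, int eventType) {
        if ("pre".equals(r.getLocalName())) {
            preDepth += (eventType == XMLStreamConstants.START_ELEMENT) ? 1 : -1;
        }
        return preDepth == 0; // no modifications while inside <pre>
    }

    // Records the answer given at every START_ELEMENT and END_ELEMENT of the document.
    static List<Boolean> demo(String xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            PreAwarePolicySketch policy = new PreAwarePolicySketch();
            List<Boolean> answers = new ArrayList<>();
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        || event == XMLStreamConstants.END_ELEMENT) {
                    answers.add(policy.allowModificationsAfter(r, event));
                }
            }
            r.close();
            return answers;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the method is called for both start and end tags, a simple depth counter is enough to know when the reader has left the protected element again.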

possiblyModifyText

public abstract boolean possiblyModifyText(javax.xml.stream.XMLStreamReader r,
                                           int prevEvent)
                                    throws javax.xml.stream.XMLStreamException
Method called for CHARACTERS and CDATA events when the previous call to allowModificationsAfter(javax.xml.stream.XMLStreamReader, int) returned true. It is used to determine whether this text segment may need to be modified (up to and including being removed, as is the case for indentation removal).

Note: StAX stream readers are allowed to report CDATA sections as CHARACTERS too, so some implementations may not allow distinguishing between CDATA and other text. Further, when text is coalesced, the resulting event type will always be CHARACTERS once segments are combined, even if they were all adjacent CDATA sections.

Parameters:
r - Stream reader that currently points to the CHARACTERS or CDATA event for which the method is called.
prevEvent - Type of the event that immediately preceded the current event.
Throws:
javax.xml.stream.XMLStreamException
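
The note on CDATA and coalescing can be checked with the standard StAX API alone. The following is a small sketch, with CoalescingSketch and readCoalescedText as hypothetical names: it turns on XMLInputFactory.IS_COALESCING, fails if any event is still typed CDATA, and returns the combined text.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class CoalescingSketch {
    // Returns all text of the document; throws if a CDATA-typed event shows up.
    static String readCoalescedText(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader r = factory.createXMLStreamReader(new StringReader(xml));
            StringBuilder text = new StringBuilder();
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.CDATA) {
                    throw new IllegalStateException("unexpected CDATA event while coalescing");
                }
                if (event == XMLStreamConstants.CHARACTERS) {
                    text.append(r.getText());
                }
            }
            r.close();
            return text.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A modifier that wants to treat CDATA differently from ordinary text therefore cannot rely on the event type when the builder's reader is coalescing.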

textToIncludeBetween

public abstract String textToIncludeBetween(javax.xml.stream.XMLStreamReader r,
                                            int prevEvent,
                                            int nextEvent,
                                            String text)
                                     throws javax.xml.stream.XMLStreamException
Method called to determine what to include in place of the preceding text segment (of type CHARACTERS or CDATA), given the event types that precede and follow the text segment. This allows for removal of (indentation) white space (return null or an empty string), trimming of leading and/or trailing white space (return the trimmed text), or just returning the passed-in text as is.

The method is only called if the immediately preceding call to possiblyModifyText(javax.xml.stream.XMLStreamReader, int) returned true; otherwise text is included as is without calling this method.

Note that when this method is called, the passed-in stream reader already points to the event following the text, not the text itself. Because of this, the text is passed explicitly, as it can NOT be accessed via the stream reader.

Returns:
Text to include in place of a CHARACTERS event; may be the text passed in (no change), null/empty String (remove the text event, usually all white space like indentation), or a modified String.
Throws:
javax.xml.stream.XMLStreamException
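
To illustrate how the surrounding event types can drive the result, here is a sketch of one hypothetical trimming policy: leading white space is dropped when the text directly follows a start tag, and trailing white space is dropped when it directly precedes an end tag. EdgeTrimmerSketch is an invented name and the policy itself is an assumption, not something this class prescribes.

```java
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Stand-alone sketch (assumption); a real modifier would extend StAXTextModifier.
class EdgeTrimmerSketch {

    // The reader argument is unused: it already points past the text.
    public String textToIncludeBetween(XMLStreamReader r, int prevEvent, int nextEvent, String text) {
        String result = text;
        if (prevEvent == XMLStreamConstants.START_ELEMENT) {
            result = result.replaceFirst("\\A\\s+", ""); // trim leading white space
        }
        if (nextEvent == XMLStreamConstants.END_ELEMENT) {
            result = result.replaceFirst("\\s+\\z", ""); // trim trailing white space
        }
        return result;
    }
}
```

With this policy, white space between text and an inline child element is preserved, while padding next to the enclosing tags is removed.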


Copyright (c) 2004-2009 The Compass Project.