public class DocumentXmlUtils extends Object
These methods were originally part of DocumentImpl, but as they are not specific to any one implementation of the Document interface they have been moved here.

Modifier and Type | Field and Description |
---|---|
static int | DOC_SIZE_MULTIPLICATION_FACTOR: This field is used when creating StringBuffers for toXml() methods. |
static Map<Character,String> | entitiesMap: A map, initialized in init(), containing the entities that need to be replaced in strings. |
Constructor and Description |
---|
DocumentXmlUtils() |
Modifier and Type | Method and Description |
---|---|
static void | annotationSetToXml(AnnotationSet anAnnotationSet, StringBuffer buffer): Converts the annotation set to XML, which is appended to the supplied StringBuffer instance. |
static void | annotationSetToXml(AnnotationSet anAnnotationSet, String annotationSetNameToUse, StringBuffer buffer): Converts the annotation set to XML, which is appended to the supplied StringBuffer instance. |
static void | buildEntityMapFromString(String aScanString, SortedMap<Long,Character> aMapToFill): Searches aScanString for the chars from entitiesMap that appear in the string. |
static StringBuffer | combinedNormalisation(String inputString): Combines replaceCharsWithEntities and filterNonXmlChars in a single method. |
static StringBuffer | featuresToXml(FeatureMap aFeatureMap, Map<String,StringBuffer> normalizedFeatureNames): Saves a FeatureMap as XML elements. |
static StringBuffer | filterNonXmlChars(StringBuffer aStrBuffer): Filters out any non-XML chars (see http://www.w3c.org/TR/2000/REC-xml-20001006#charsets). All non-XML chars are replaced with 0x20 (the space char), which ensures that the next time the document is loaded there won't be any problems. |
static boolean | isXmlChar(char ch): Decides whether a char is a valid XML char. |
static StringBuffer | replaceCharsWithEntities(String anInputString): Replaces every char in anInputString that is also in entitiesMap with its corresponding entity. |
static String | textWithNodes(TextualDocument doc, String aText): Returns the document's text interspersed with <Node> elements at all points where the document has an annotation beginning or ending. |
static String | toXml(TextualDocument doc): Returns a GateXml document, a custom XML format for which there is a reader inside GATE called gate.xml.GateFormatXmlHandler. |
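
The isXmlChar and filterNonXmlChars entries in the summary above describe a validity check and a replace-with-space filter. The following is a minimal sketch of that idea, not the GATE implementation: the class name XmlCharSketch is hypothetical, the ranges follow the XML 1.0 Char production for single UTF-16 code units, and treating surrogate code units as invalid is a simplification made for this sketch.

```java
final class XmlCharSketch {
    /** True if ch lies in a single-code-unit range of the XML 1.0 Char production. */
    static boolean isXmlChar(char ch) {
        if (ch == 0x9 || ch == 0xA || ch == 0xD) return true; // tab, LF, CR
        if (ch >= 0x20 && ch <= 0xD7FF) return true;
        return ch >= 0xE000 && ch <= 0xFFFD;
    }

    /** Replaces each non-XML char in place with a space; null yields an empty buffer. */
    static StringBuffer filterNonXmlChars(StringBuffer buffer) {
        if (buffer == null) return new StringBuffer();
        for (int i = 0; i < buffer.length(); i++) {
            if (!isXmlChar(buffer.charAt(i))) buffer.setCharAt(i, ' ');
        }
        return buffer;
    }

    public static void main(String[] args) {
        StringBuffer sb = new StringBuffer("ab\u0000cd\u0007");
        // The two control chars are overwritten with spaces.
        System.out.println("[" + filterNonXmlChars(sb) + "]");
    }
}
```

Replacing rather than deleting invalid chars keeps the buffer length unchanged, so any offsets computed against the original text remain valid.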
public static final int DOC_SIZE_MULTIPLICATION_FACTOR
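
The field summary says only that this constant is used when creating StringBuffers for toXml() methods. One plausible reading is that it scales the document length to estimate an initial buffer capacity; the sketch below illustrates that reading. The class name, the helper newXmlBuffer, and the value 40 are all assumptions for illustration and are not taken from this page.

```java
final class BufferSizingSketch {
    // Illustrative value only; the real constant's value is not stated on this page.
    static final int ASSUMED_FACTOR = 40;

    /** Creates an empty buffer whose initial capacity scales with the text length. */
    static StringBuffer newXmlBuffer(String documentText) {
        return new StringBuffer(ASSUMED_FACTOR * documentText.length());
    }

    public static void main(String[] args) {
        StringBuffer buffer = newXmlBuffer("abcd");
        System.out.println(buffer.capacity());
    }
}
```

Presizing in this way avoids repeated internal array growth while the XML is appended, at the cost of a larger up-front allocation.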
public static String toXml(TextualDocument doc)
doc - the document to serialize.
public static StringBuffer featuresToXml(FeatureMap aFeatureMap, Map<String,StringBuffer> normalizedFeatureNames)
aFeatureMap - the feature map that has to be saved as XML.
public static StringBuffer combinedNormalisation(String inputString)
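
A single-pass combination of entity replacement and non-XML char filtering might look like the sketch below. It is an illustration under stated assumptions rather than the GATE code: the class name is hypothetical, the ENTITIES table stands in for entitiesMap (whose actual contents are not listed on this page), and returning an empty buffer for null input is assumed by analogy with the other methods.

```java
import java.util.HashMap;
import java.util.Map;

final class CombinedNormalisationSketch {
    // Hypothetical stand-in for entitiesMap: the five predefined XML entities.
    static final Map<Character, String> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put('<', "&lt;");
        ENTITIES.put('>', "&gt;");
        ENTITIES.put('&', "&amp;");
        ENTITIES.put('\'', "&apos;");
        ENTITIES.put('"', "&quot;");
    }

    /** Single-code-unit ranges of the XML 1.0 Char production. */
    static boolean isXmlChar(char ch) {
        if (ch == 0x9 || ch == 0xA || ch == 0xD) return true;
        if (ch >= 0x20 && ch <= 0xD7FF) return true;
        return ch >= 0xE000 && ch <= 0xFFFD;
    }

    /** Escapes entity chars and replaces non-XML chars with a space in one pass. */
    static StringBuffer combinedNormalisation(String input) {
        if (input == null) return new StringBuffer(); // assumed null handling
        StringBuffer out = new StringBuffer(input.length());
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            String entity = ENTITIES.get(ch);
            if (entity != null) {
                out.append(entity);
            } else if (isXmlChar(ch)) {
                out.append(ch);
            } else {
                out.append(' ');
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(combinedNormalisation("a<b\u0001&"));
    }
}
```

Doing both steps in one loop avoids building an intermediate buffer between the two operations.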
public static StringBuffer filterNonXmlChars(StringBuffer aStrBuffer)
aStrBuffer - the input string to be filtered. If aStrBuffer is null then an empty string will be returned.
public static boolean isXmlChar(char ch)
ch - the char to be tested.
public static StringBuffer replaceCharsWithEntities(String anInputString)
anInputString - the string to be analyzed. If it is null then the empty string is returned.
public static String textWithNodes(TextualDocument doc, String aText)
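
To illustrate what interspersing text with <Node> elements means, here is a minimal sketch that does not depend on GATE. The class name is hypothetical, the SortedSet of offsets is a stand-in for the annotation start and end points that the real method reads from a TextualDocument, and the id attribute carrying the offset is an assumption about the element's form. Escaping of the text itself is omitted.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

final class TextWithNodesSketch {
    /** Inserts a Node element at each offset; out-of-range offsets are skipped. */
    static String textWithNodes(String text, SortedSet<Integer> offsets) {
        StringBuilder out = new StringBuilder();
        int last = 0;
        for (int offset : offsets) {
            if (offset < 0 || offset > text.length()) continue;
            out.append(text, last, offset); // text between the previous node and this one
            out.append("<Node id=\"").append(offset).append("\"/>");
            last = offset;
        }
        out.append(text, last, text.length()); // trailing text after the final node
        return out.toString();
    }

    public static void main(String[] args) {
        SortedSet<Integer> offsets = new TreeSet<>(Arrays.asList(0, 5, 6, 11));
        System.out.println(textWithNodes("Hello world", offsets));
    }
}
```

Because the offsets are held in a sorted set, two annotations that share a boundary produce a single node at that point.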
public static void buildEntityMapFromString(String aScanString, SortedMap<Long,Character> aMapToFill)
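
The summary describes this method as searching aScanString for chars from entitiesMap. A hedged sketch of such a scan follows; it assumes that the keys of aMapToFill are character offsets into the string, which this page does not state. The class name and the ENTITIES table are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class EntityScanSketch {
    // Hypothetical stand-in for entitiesMap.
    static final Map<Character, String> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put('<', "&lt;");
        ENTITIES.put('>', "&gt;");
        ENTITIES.put('&', "&amp;");
    }

    /** Records, for each entity char found, its offset (assumed key) and the char. */
    static void buildEntityMapFromString(String scan, SortedMap<Long, Character> toFill) {
        if (scan == null || toFill == null) return;
        for (int i = 0; i < scan.length(); i++) {
            char ch = scan.charAt(i);
            if (ENTITIES.containsKey(ch)) toFill.put((long) i, ch);
        }
    }

    public static void main(String[] args) {
        SortedMap<Long, Character> found = new TreeMap<>();
        buildEntityMapFromString("a<b&c", found);
        System.out.println(found);
    }
}
```

A sorted map keeps the hits in document order, which is convenient when the positions are later merged with other offset-keyed data.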
public static void annotationSetToXml(AnnotationSet anAnnotationSet, StringBuffer buffer)
anAnnotationSet - the annotation set that has to be saved as XML.
buffer - the StringBuffer that the XML representation should be appended to.
public static void annotationSetToXml(AnnotationSet anAnnotationSet, String annotationSetNameToUse, StringBuffer buffer)
The two-argument annotationSetToXml method uses the name that belongs to the provided annotation set; this method, however, allows one to store the provided annotation set under a different annotation set name.
anAnnotationSet - the annotation set that has to be saved as XML.
annotationSetNameToUse - the new name for the annotation set being converted to XML.
buffer - the StringBuffer that the XML representation should be appended to.
Copyright © 2024 GATE. All rights reserved.