public class Utils extends Object
Use
import static gate.Utils.*
to access these methods without having to qualify them with a class name.
In Groovy code, this class can be used as a category to inject each utility
method into the class of its first argument, e.g.

Document doc = // ...
Annotation ann = // ...
use(gate.Utils) {
  println "Annotation has ${ann.length()} characters"
  println "and covers the string \"${doc.stringFor(ann)}\""
}

Modifier and Type | Field and Description |
---|---|
static OffsetComparator |
OFFSET_COMPARATOR
A single instance of
OffsetComparator that can be used by any code
that requires one. |
Constructor and Description |
---|
Utils() |
Modifier and Type | Method and Description |
---|---|
static Integer |
addAnn(AnnotationSet outSet,
AnnotationSet spanSet,
String type,
FeatureMap fm)
Add a new annotation to the output annotation set outSet, spanning the same
region as spanSet, and having the given type and feature map.
|
static Integer |
addAnn(AnnotationSet outSet,
Annotation spanAnn,
String type,
FeatureMap fm)
Add a new annotation to the output annotation set outSet, covering the same
region as the annotation spanAnn, and having the given type and feature map.
|
static Integer |
addAnn(AnnotationSet outSet,
long startOffset,
long endOffset,
String type,
FeatureMap fm)
Add a new annotation to the output annotation set outSet, spanning the
given offset range, and having the given type and feature map.
|
static String |
cleanString(String input)
Return a cleaned version of the input String.
|
static String |
cleanStringFor(Document doc,
AnnotationSet anns)
Return the cleaned document text as a String covered by the given annotation set.
|
static String |
cleanStringFor(Document doc,
Long start,
Long end)
Return the cleaned document text between the provided offsets.
|
static String |
cleanStringFor(Document doc,
SimpleAnnotation ann)
Return the cleaned document text as a String corresponding to the annotation.
|
static DocumentContent |
contentFor(SimpleDocument doc,
AnnotationSet anns)
Return the DocumentContent covered by the given annotation set.
|
static DocumentContent |
contentFor(SimpleDocument doc,
SimpleAnnotation ann)
Return the DocumentContent corresponding to the annotation.
|
static Long |
end(AnnotationSet as)
Get the end offset of an annotation set.
|
static Long |
end(SimpleAnnotation a)
Get the end offset of an annotation.
|
static Long |
end(SimpleDocument d)
Get the end offset of a document.
|
static String |
expandUriString(String toExpand,
Map<String,String> prefixes)
Expand both namespace prefixes and base-uris, if possible.
|
static FeatureMap |
featureMap(Object... values)
Create a feature map from an array of values.
|
static AnnotationSet |
getAnnotationsAtOffset(AnnotationSet annotationSet,
Long atOffset)
Return the subset of annotations from the given annotation set
that start exactly at the given offset.
|
static AnnotationSet |
getAnnotationsEndingAtOffset(AnnotationSet annotationSet,
Long endOffset) |
static AnnotationSet |
getCoextensiveAnnotations(AnnotationSet source,
Annotation coextAnn)
Get all the annotations from the source annotation set that start and end
at exactly the same offsets as the given annotation.
|
static AnnotationSet |
getCoextensiveAnnotations(AnnotationSet source,
AnnotationSet coextSet)
Get all the annotations from the source annotation set that start and end
at exactly the same offsets as the given annotation set.
|
static AnnotationSet |
getCoextensiveAnnotations(AnnotationSet source,
AnnotationSet coextSet,
String type)
Get all the annotations from the source annotation set that start and end
at exactly the same offsets as the given annotation set and are of the
specified type.
|
static AnnotationSet |
getCoextensiveAnnotations(AnnotationSet source,
Annotation coextAnn,
String type)
Get all the annotations from the source annotation set that start and end
at exactly the same offsets as the given annotation and have the specified type.
|
static AnnotationSet |
getContainedAnnotations(AnnotationSet sourceAnnotationSet,
Annotation containingAnnotation)
Get all the annotations from the source annotation set that lie within
the range of the containing annotation.
|
static AnnotationSet |
getContainedAnnotations(AnnotationSet sourceAnnotationSet,
AnnotationSet containingAnnotationSet)
Get all the annotations from the source annotation set that lie within
the range of the containing annotation set.
|
static AnnotationSet |
getContainedAnnotations(AnnotationSet sourceAnnotationSet,
AnnotationSet containingAnnotationSet,
String targetType)
Get all the annotations from the source annotation set with a type equal to
targetType that lie within
the range of the containing annotation set.
|
static AnnotationSet |
getContainedAnnotations(AnnotationSet sourceAnnotationSet,
Annotation containingAnnotation,
String targetType)
Get all the annotations of type targetType
from the source annotation set that lie within
the range of the containing annotation.
|
static AnnotationSet |
getCoveringAnnotations(AnnotationSet sourceAnnotationSet,
Annotation coveredAnnotation)
Get all the annotations from the source annotation set that cover
the range of the specified annotation.
|
static AnnotationSet |
getCoveringAnnotations(AnnotationSet sourceAnnotationSet,
AnnotationSet coveredAnnotationSet)
Get all the annotations from the source annotation set that cover
the range of the specified annotation set.
|
static AnnotationSet |
getCoveringAnnotations(AnnotationSet sourceAnnotationSet,
AnnotationSet coveredAnnotationSet,
String targetType)
Get all the annotations from the source annotation set with a type equal to
targetType that cover
the range of the specified annotation set.
|
static AnnotationSet |
getCoveringAnnotations(AnnotationSet sourceAnnotationSet,
Annotation coveredAnnotation,
String targetType)
Get all the annotations of type targetType
from the source annotation set that cover
the range of the specified annotation.
|
static Annotation |
getOnlyAnn(AnnotationSet annset)
Return the only annotation that annset is expected to contain; throw an
exception if there is not exactly one annotation.
|
static AnnotationSet |
getOverlappingAnnotations(AnnotationSet sourceAnnotationSet,
Annotation overlappedAnnotation)
Get all the annotations from the source annotation set that
partly or totally overlap
the range of the specified annotation.
|
static AnnotationSet |
getOverlappingAnnotations(AnnotationSet sourceAnnotationSet,
AnnotationSet overlappedAnnotationSet)
Get all the annotations from the source annotation set that overlap
the range of the specified annotation set.
|
static AnnotationSet |
getOverlappingAnnotations(AnnotationSet sourceAnnotationSet,
AnnotationSet overlappedAnnotationSet,
String targetType)
Get all the annotations from the source annotation set with a type equal to
targetType that partly or completely overlap the range of the specified
annotation set.
|
static AnnotationSet |
getOverlappingAnnotations(AnnotationSet sourceAnnotationSet,
Annotation overlappedAnnotation,
String targetType)
Get all the annotations of type targetType
from the source annotation set that partly or totally overlap
the range of the specified annotation.
|
static RunningStrategy |
getRunningStrategy(Controller controller,
ProcessingResource pr)
Return the running strategy of the PR in the controller, if the controller
is a conditional controller.
|
static List<Annotation> |
inDocumentOrder(AnnotationSet as)
Return a List containing the annotations in the given annotation set, in
document order.
|
static AnnotationSet |
intersect(AnnotationSet origSet,
Annotation... others)
Return the subset from the original set that matches one of the given annotations.
|
static AnnotationSet |
intersect(AnnotationSet origSet,
Collection<Annotation> others) |
static boolean |
isEnabled(Controller controller,
ProcessingResource pr)
This method can be used to check if a ProcessingResource has
a chance to be run in the given controller with the current settings.
|
static boolean |
isLoggedOnce(String message)
Check if a message has already been logged or shown.
|
static int |
length(Document doc)
Return the length of the document as an
int -- if the content is too long for an int, the method will throw a
GateRuntimeException.
|
static int |
length(SimpleAnnotation ann)
Return the length of the document content covered by an Annotation as an
int -- if the content is too long for an int, the method will throw
a GateRuntimeException.
|
static long |
lengthLong(Document doc)
Return the length of the document as a long.
|
static long |
lengthLong(SimpleAnnotation ann)
Return the length of the document content covered by an Annotation as a
long.
|
static void |
loadPlugin(File pluginDir)
Load a plugin from the specified directory.
|
static void |
loadPlugin(String dirName)
Deprecated.
|
static void |
logOnce(org.apache.log4j.Logger logger,
org.apache.log4j.Level level,
String message)
Deprecated.
Log4J support will be removed in future, please use SLF4J
|
static void |
logOnce(org.slf4j.Logger logger,
org.slf4j.event.Level level,
String message)
Issue a message to the log but only if the same message has not
been logged already in the same GATE session.
|
static AnnotationSet |
minus(AnnotationSet origSet,
Annotation... except)
Return the given set with the given annotations removed.
|
static AnnotationSet |
minus(AnnotationSet origSet,
Collection<Annotation> exceptions)
Return the given set with the given annotations removed.
|
static AnnotationSet |
plus(AnnotationSet origSet,
Annotation... toAdd)
Return the given set with the given annotations added.
|
static AnnotationSet |
plus(AnnotationSet origSet,
Collection<Annotation> toAdd)
Return the given set with the given annotations added.
|
static String |
replaceVariablesInString(String string,
Object... sources)
This will replace all occurrences of variables of the form $env{name},
$prop{name}, $doc{featname}, $pr_parm{inputAS} or $$env{name} etc in a String.
|
static URL |
resolveURL(String url) |
static URL |
resolveURL(URL url) |
static String |
shortenUriString(String uriString,
Map<String,String> prefixes)
Compact a URI String using the base URI and namespace prefixes.
|
static Long |
start(AnnotationSet as)
Get the start offset of an annotation set.
|
static Long |
start(SimpleAnnotation a)
Get the start offset of an annotation.
|
static Long |
start(SimpleDocument d)
Get the start offset of a document.
|
static String |
stringFor(Document doc,
AnnotationSet anns)
Return the document text as a String covered by the given annotation set.
|
static String |
stringFor(Document doc,
Long start,
Long end)
Returns the document text between the provided offsets.
|
static String |
stringFor(Document doc,
SimpleAnnotation ann)
Return the document text as a String corresponding to the annotation.
|
static FeatureMap |
toFeatureMap(Map<?,?> map)
Create a feature map from an existing map (typically one that does not
itself implement FeatureMap).
|
public static final OffsetComparator OFFSET_COMPARATOR
A single instance of OffsetComparator that can be used by any code that requires one.

public static int length(SimpleAnnotation ann)
ann - the annotation for which to determine the length

public static long lengthLong(SimpleAnnotation ann)
ann - the annotation for which to determine the length

public static int length(Document doc)
doc - the document for which to determine the length

public static long lengthLong(Document doc)
doc - the document for which to determine the length

public static DocumentContent contentFor(SimpleDocument doc, SimpleAnnotation ann)
Note: the DocumentContent object returned will also contain the original content, which can be accessed using the getOriginalContent() method.
doc - the document from which to extract the content
ann - the annotation for which to return the content

public static String stringFor(Document doc, SimpleAnnotation ann)
doc - the document from which to extract the document text
ann - the annotation for which to return the text

public static String cleanStringFor(Document doc, SimpleAnnotation ann)
doc - the document from which to extract the document text
ann - the annotation for which to return the text

public static String stringFor(Document doc, Long start, Long end)
doc - the document from which to extract the document text
start - the start offset
end - the end offset

public static String cleanStringFor(Document doc, Long start, Long end)
doc - the document from which to extract the document text
start - the start offset
end - the end offset

public static DocumentContent contentFor(SimpleDocument doc, AnnotationSet anns)
Note: the DocumentContent object returned will also contain the original content, which can be accessed using the getOriginalContent() method.
doc - the document from which to extract the content
anns - the annotation set for which to return the content

public static String stringFor(Document doc, AnnotationSet anns)
doc - the document from which to extract the document text
anns - the annotation set for which to return the text

public static String cleanStringFor(Document doc, AnnotationSet anns)
doc - the document from which to extract the document text
anns - the annotation set for which to return the text

public static String cleanString(String input)
public static Long start(SimpleAnnotation a)
public static Long start(AnnotationSet as)
public static Long start(SimpleDocument d)
public static Long end(SimpleAnnotation a)
public static Long end(AnnotationSet as)
public static Long end(SimpleDocument d)
public static AnnotationSet getAnnotationsAtOffset(AnnotationSet annotationSet, Long atOffset)
annotationSet - the set of annotations from which to select
atOffset - the offset where the annotations to be returned should start

public static AnnotationSet getAnnotationsEndingAtOffset(AnnotationSet annotationSet, Long endOffset)
public static AnnotationSet getContainedAnnotations(AnnotationSet sourceAnnotationSet, Annotation containingAnnotation)
sourceAnnotationSet - the annotation set from which to select
containingAnnotation - the annotation whose range must contain the selected annotations

public static AnnotationSet getContainedAnnotations(AnnotationSet sourceAnnotationSet, Annotation containingAnnotation, String targetType)
sourceAnnotationSet - the annotation set from which to select
containingAnnotation - the annotation whose range must contain the selected annotations
targetType - the type the selected annotations must have. If the empty string, no filtering on type is done.

public static AnnotationSet getContainedAnnotations(AnnotationSet sourceAnnotationSet, AnnotationSet containingAnnotationSet)
sourceAnnotationSet - the annotation set from which to select
containingAnnotationSet - the annotation set whose range must contain the selected annotations

public static AnnotationSet getContainedAnnotations(AnnotationSet sourceAnnotationSet, AnnotationSet containingAnnotationSet, String targetType)
sourceAnnotationSet - the annotation set from which to select
containingAnnotationSet - the annotation set whose range must contain the selected annotations
targetType - the type the selected annotations must have

public static AnnotationSet getCoveringAnnotations(AnnotationSet sourceAnnotationSet, Annotation coveredAnnotation)
sourceAnnotationSet - the annotation set from which to select
coveredAnnotation - the annotation whose range must equal or lie within the selected annotations

public static AnnotationSet getCoveringAnnotations(AnnotationSet sourceAnnotationSet, Annotation coveredAnnotation, String targetType)
sourceAnnotationSet - the annotation set from which to select
coveredAnnotation - the annotation whose range must be covered
targetType - the type the selected annotations must have. If the empty string, no filtering on type is done.

public static AnnotationSet getCoveringAnnotations(AnnotationSet sourceAnnotationSet, AnnotationSet coveredAnnotationSet)
sourceAnnotationSet - the annotation set from which to select
coveredAnnotationSet - the annotation set whose range must be covered by the selected annotations

public static AnnotationSet getCoveringAnnotations(AnnotationSet sourceAnnotationSet, AnnotationSet coveredAnnotationSet, String targetType)
sourceAnnotationSet - the annotation set from which to select
coveredAnnotationSet - the annotation set whose range must be covered by the selected annotations
targetType - the type the selected annotations must have

public static AnnotationSet getOverlappingAnnotations(AnnotationSet sourceAnnotationSet, Annotation overlappedAnnotation)
sourceAnnotationSet - the annotation set from which to select
overlappedAnnotation - the annotation whose range the selected annotations must overlap

public static AnnotationSet getOverlappingAnnotations(AnnotationSet sourceAnnotationSet, Annotation overlappedAnnotation, String targetType)
sourceAnnotationSet - the annotation set from which to select
overlappedAnnotation - the annotation whose range the selected annotations must overlap
targetType - the type the selected annotations must have. If the empty string, no filtering on type is done.

public static AnnotationSet getOverlappingAnnotations(AnnotationSet sourceAnnotationSet, AnnotationSet overlappedAnnotationSet)
sourceAnnotationSet - the annotation set from which to select
overlappedAnnotationSet - the annotation set whose range must be overlapped by the selected annotations

public static AnnotationSet getOverlappingAnnotations(AnnotationSet sourceAnnotationSet, AnnotationSet overlappedAnnotationSet, String targetType)
sourceAnnotationSet - the annotation set from which to select
overlappedAnnotationSet - the annotation set whose range must be overlapped by the selected annotations
targetType - the type the selected annotations must have

public static List<Annotation> inDocumentOrder(AnnotationSet as)
as - the annotation set
Returns the annotations from as in document order.

public static FeatureMap featureMap(Object... values)
values - an even number of items, alternating keys and values

public static FeatureMap toFeatureMap(Map<?,?> map)
map - the map to convert

public static boolean isEnabled(Controller controller, ProcessingResource pr)
That means that for a non-conditional controller, the method will return true if the PR is part of the controller. For a conditional controller, the method will return true if it is part of the controller and at least once (if the same PR is contained multiple times) it is not disabled.
public static RunningStrategy getRunningStrategy(Controller controller, ProcessingResource pr)

@Deprecated public static void logOnce(org.apache.log4j.Logger logger, org.apache.log4j.Level level, String message)
logger - the logger instance to use
level - a Log4J severity level for the message
message - the message itself

public static void logOnce(org.slf4j.Logger logger, org.slf4j.event.Level level, String message)
logger - the logger instance to use
level - an SLF4J severity level for the message
message - the message itself

public static boolean isLoggedOnce(String message)
message - the message that should only be logged or shown once

public static Annotation getOnlyAnn(AnnotationSet annset)
annset - the annotation set that is expected to contain exactly one annotation

public static Integer addAnn(AnnotationSet outSet, AnnotationSet spanSet, String type, FeatureMap fm)
outSet - the annotation set where the new annotation will be added
spanSet - an annotation set representing the span of the new annotation
type - the annotation type of the new annotation
fm - the feature map to use for the new annotation

public static Integer addAnn(AnnotationSet outSet, long startOffset, long endOffset, String type, FeatureMap fm)
outSet - the annotation set where the new annotation will be added
startOffset - the start offset of the new annotation
endOffset - the end offset of the new annotation
type - the annotation type of the new annotation
fm - the feature map to use for the new annotation

public static Integer addAnn(AnnotationSet outSet, Annotation spanAnn, String type, FeatureMap fm)
outSet - the annotation set where the new annotation will be added
spanAnn - an annotation representing the span of the new annotation
type - the annotation type of the new annotation
fm - the feature map to use for the new annotation

public static String expandUriString(String toExpand, Map<String,String> prefixes)
If the map only contains a basename uri (if the only entry is for the empty string key) then name space prefixes are not checked: in this case, the toExpand string may contain an unescaped colon. If the map does not contain a basename URI (if there is no entry for the empty string key) then all toExpand strings are expected to be qNames.
NOTE: the name prefixes in the prefixes map must include the trailing colon!
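The prefix rules described above can be pictured with a small, self-contained sketch. It is not the GATE implementation: the class name UriExpandSketch, the method expand, and the choice to throw when neither a prefix nor a base URI applies are all assumptions made for illustration. It only mirrors what the text states: prefix keys carry their trailing colon, the empty-string key holds the base URI, and prefixes are not checked when the base URI is the only entry.

```java
import java.util.Map;

// Hypothetical stand-in for illustration; this is NOT gate.Utils.expandUriString.
class UriExpandSketch {

    // Expand a qName such as "rdfs:label", or a bare name, into a full URI string.
    static String expand(String toExpand, Map<String, String> prefixes) {
        String base = prefixes.get("");               // base URI lives under the empty key
        boolean onlyBase = base != null && prefixes.size() == 1;

        if (!onlyBase) {
            // Prefix keys are expected to include the trailing colon, e.g. "rdfs:".
            for (Map.Entry<String, String> e : prefixes.entrySet()) {
                String prefix = e.getKey();
                if (!prefix.isEmpty() && toExpand.startsWith(prefix)) {
                    return e.getValue() + toExpand.substring(prefix.length());
                }
            }
        }
        if (base != null) {
            // With only a base URI, the input may contain an unescaped colon.
            return base + toExpand;
        }
        // Assumption: without a base URI every input must be a qName with a known prefix.
        throw new IllegalArgumentException("No prefix or base URI matches: " + toExpand);
    }
}
```

With a map containing "" mapped to "http://example.org/base#" and "rdfs:" mapped to the RDF Schema namespace, the sketch expands "rdfs:label" through the prefix entry and "Person" through the base URI.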
toExpand - the URI portion to expand as a String
prefixes - a map from name prefixes to URI prefixes

public static String shortenUriString(String uriString, Map<String,String> prefixes)
uriString - a full URI String that should get shortened using prefix names or a base URI
prefixes - a map containing name prefixes mapped to URI prefixes (same as for expandUriString)

public static AnnotationSet getCoextensiveAnnotations(AnnotationSet source, AnnotationSet coextSet)
source - the annotation set from which to select
coextSet - the annotation set from which to take the start and end offsets

public static AnnotationSet getCoextensiveAnnotations(AnnotationSet source, AnnotationSet coextSet, String type)
source - the annotation set from which to select
coextSet - the annotation set from which to take the start and end offsets
type - the desired annotation type of the annotations to return

public static AnnotationSet getCoextensiveAnnotations(AnnotationSet source, Annotation coextAnn)
source - the annotation set from which to select
coextAnn - the annotation from which to take the start and end offsets

public static AnnotationSet getCoextensiveAnnotations(AnnotationSet source, Annotation coextAnn, String type)
source - the annotation set from which to select
coextAnn - the annotation from which to take the start and end offsets

public static String replaceVariablesInString(String string, Object... sources)
Examples:
replaceVariablesInString("text $env{GATE_HOME} more text"): returns "text /path/to/gate more text" if the environment variable "GATE_HOME" was set to "/path/to/gate"
replaceVariablesInString("text $pr{myfeature1} more text",pr1): returns "text myvalue1 more text" if the feature map of the processing resource pr1 contains an entry with key "myfeature1" and value "myvalue1"
replaceVariablesInString("text ${somekey} more text",map1,map2,resource1,map3): this will find the value of an entry with key "somekey" in the first Map object specified in the parameter list of the method.
The possible sources for finding values for a variable are:
System.getenv(): for variables of the form $env{name}
System.getProperties(): for variables of the form $prop{name}
Resource: the feature map of any resource which is specified in the list of objects is used for variables of the form $resource{name}, or for variables of the form $corpus{name} if the resource is a corpus, for $pr{name} if the resource is a processing resource, and so on. If the resource is a processing resource its
FeatureMap or Map: any feature map or Map which can be used to look up String keys can be specified as a source and will be used for variables of the form ${name}.
The value substituted is converted to a string using the toString() method of whatever object is stored in the map. If the value returned by Map.get(key) is null, no substitution is carried out and the variable is left unchanged in the string.
The following variable constructs are supported:
If two dollar characters are used instead of one, the replacement string will in turn be subject to replacement, e.g. $$env{abc} could get replaced with the replacement string '$corpus{f1}' which would in turn get replaced with the value of the feature 'f1' from the feature set of the first corpus in the parameter list that has a value for that feature.
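The substitution rules above can be sketched in a few lines of plain Java. This is an illustrative model only, not the GATE code: the class VariableReplaceSketch, its replace method, and the decision to pass the environment in as a Map (so the example does not depend on the real process environment) are assumptions. Only $env{name}, $prop{name} and ${name} are modelled; resource-based forms such as $pr{name} are left unchanged, and ${name} is resolved against the first map holding a non-null value, which is one possible reading of the text.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; this is NOT gate.Utils.replaceVariablesInString.
class VariableReplaceSketch {

    // One or two dollar signs, an optional kind ("env", "prop", ...), then {name}.
    private static final Pattern VAR =
            Pattern.compile("(\\$\\$?)([a-z_]*)\\{([^}]+)\\}");

    static String replace(String input, Map<String, String> env,
                          List<? extends Map<String, ?>> maps) {
        Matcher m = VAR.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = lookup(m.group(2), m.group(3), env, maps);
            String replacement;
            if (value == null) {
                replacement = m.group();                       // no value: leave variable as is
            } else if (m.group(1).length() == 2) {
                replacement = replace(value.toString(), env, maps); // "$$": expand the result again
            } else {
                replacement = value.toString();
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    private static Object lookup(String kind, String name, Map<String, String> env,
                                 List<? extends Map<String, ?>> maps) {
        switch (kind) {
            case "env":
                return env.get(name);
            case "prop":
                return System.getProperty(name);
            case "":
                for (Map<String, ?> map : maps) {
                    Object v = map.get(name);
                    if (v != null) return v;
                }
                return null;
            default:
                return null;                                   // other kinds are not modelled here
        }
    }
}
```

The sketch does not guard against a double-dollar value that refers back to itself; a real implementation would need some protection against that kind of cycle.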
@Deprecated public static void loadPlugin(String dirName)
dirName - The directory name of the plugin within the standard GATE plugins directory.

public static void loadPlugin(File pluginDir)
public static AnnotationSet minus(AnnotationSet origSet, Annotation... except)
NOTE: Annotation ids are only unique within a document, so you should never mix annotations from different documents when using this method!
origSet - The annotation set from which to remove the given annotations
except - The annotations to remove from the given set

public static AnnotationSet minus(AnnotationSet origSet, Collection<Annotation> exceptions)
NOTE: Annotation ids are only unique within a document, so you should never mix annotations from different documents when using this method!
origSet - The annotation set from which to remove the given exceptions
exceptions - The annotations to remove from the given set

public static AnnotationSet plus(AnnotationSet origSet, Annotation... toAdd)
NOTE: Annotation ids are only unique within a document, so you should never mix annotations from different documents when using this method!
origSet - The annotation set to which to add the given annotations
toAdd - The annotations to add to the given set

public static AnnotationSet plus(AnnotationSet origSet, Collection<Annotation> toAdd)
NOTE: Annotation ids are only unique within a document, so you should never mix annotations from different documents when using this method!
origSet - The annotation set to which to add the given annotations
toAdd - A collection of annotations to add to the original set

public static AnnotationSet intersect(AnnotationSet origSet, Annotation... others)
NOTE: Annotation ids are only unique within a document, so you should never mix annotations from different documents when using this method!
origSet - The annotation set from which to select only the given annotations
others - The given annotations

public static AnnotationSet intersect(AnnotationSet origSet, Collection<Annotation> others)
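The repeated NOTE about annotation ids is easier to see with a toy model of minus, plus and intersect. The sketch below is not GATE code: the class AnnotationSetOpsSketch, the nested Ann type and the id-keyed Map standing in for an AnnotationSet are hypothetical. It assumes annotations are matched purely by id and that each operation returns a new set rather than modifying the original, neither of which is confirmed by the text above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; none of this is GATE code.
class AnnotationSetOpsSketch {

    // Minimal annotation model: an id (unique only within one document) and a type.
    static final class Ann {
        final int id;
        final String type;

        Ann(int id, String type) {
            this.id = id;
            this.type = type;
        }
    }

    // Build an id-indexed "annotation set".
    static Map<Integer, Ann> setOf(Ann... anns) {
        Map<Integer, Ann> set = new LinkedHashMap<>();
        for (Ann a : anns) set.put(a.id, a);
        return set;
    }

    // Copy of orig without the annotations whose ids match those in except.
    static Map<Integer, Ann> minus(Map<Integer, Ann> orig, Ann... except) {
        Map<Integer, Ann> result = new LinkedHashMap<>(orig);
        for (Ann a : except) result.remove(a.id);
        return result;
    }

    // Copy of orig with the given annotations added.
    static Map<Integer, Ann> plus(Map<Integer, Ann> orig, Ann... toAdd) {
        Map<Integer, Ann> result = new LinkedHashMap<>(orig);
        for (Ann a : toAdd) result.put(a.id, a);
        return result;
    }

    // Only those members of orig whose ids match one of the given annotations.
    static Map<Integer, Ann> intersect(Map<Integer, Ann> orig, Ann... others) {
        Map<Integer, Ann> result = new LinkedHashMap<>();
        for (Ann a : others) {
            Ann match = orig.get(a.id);
            if (match != null) result.put(match.id, match);
        }
        return result;
    }
}
```

Under these assumptions, an annotation with id 2 taken from a different document would remove or select the local annotation with id 2, which is the kind of accidental match the NOTE warns about.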
public static URL resolveURL(String url) throws IOException
IOException
public static URL resolveURL(URL url) throws IOException
IOException
Copyright © 2024 GATE. All rights reserved.