public interface DocumentContent extends Serializable
Modifier and Type | Method and Description |
---|---|
DocumentContent | getContent(Long start, Long end): Return the contents under a particular span. |
Long | size(): The size of this content (e.g. |
DocumentContent getContent(Long start, Long end) throws InvalidOffsetException
Conceptually, annotation offsets fall in between characters, with "0" pointing before the first character. Because of that, the offset at which an annotation ends is the same as the offset at which the space after it starts.
So this is what the "abcde" string looks like with the offsets explicitly included: 0a1b2c3d4e5
"ab cd" would then look like this: 0a1b2 3c4d5
with the following annotations:
Token "ab" [0,2]
SpaceToken " " [2,3]
Token "cd" [3,5]
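The offset convention above can be tried out on a plain Java string. The following is a minimal sketch, not GATE code: the class `OffsetDemo` and its helpers `withOffsets` and `span` are hypothetical names introduced only for illustration, and `String.substring` stands in for a real `DocumentContent`.

```java
public class OffsetDemo {
    // Hypothetical helper: interleaves offsets with characters,
    // e.g. "abcde" becomes "0a1b2c3d4e5".
    static String withOffsets(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            sb.append(i).append(text.charAt(i));
        }
        // The final offset points after the last character.
        return sb.append(text.length()).toString();
    }

    // Hypothetical helper: text between two offsets,
    // start inclusive and end exclusive.
    static String span(String text, long start, long end) {
        return text.substring((int) start, (int) end);
    }

    public static void main(String[] args) {
        String text = "ab cd";
        System.out.println(withOffsets(text));            // 0a1b2 3c4d5
        System.out.println("[" + span(text, 0, 2) + "]"); // [ab]
        System.out.println("[" + span(text, 2, 3) + "]"); // [ ]
        System.out.println("[" + span(text, 3, 5) + "]"); // [cd]
    }
}
```

Because offsets sit between characters, the end offset of one span (2 for "ab") can serve directly as the start offset of the next, and the length of a span is simply end minus start.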
start - the beginning index, inclusive.
end - the ending index, exclusive.
InvalidOffsetException - if the start is negative, or end is larger than the length of this DocumentContent object, or start is larger than end.
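The three failure conditions listed for InvalidOffsetException can be expressed as a single bounds check. The sketch below assumes a string-backed content and is not the GATE implementation: `SpanCheckDemo`, `isValidSpan`, and the nested exception class are hypothetical stand-ins so that the example compiles without GATE on the classpath, and it uses primitive `long` where the interface uses `Long`.

```java
public class SpanCheckDemo {
    // Stand-in for GATE's InvalidOffsetException (assumption: a checked exception).
    static class InvalidOffsetException extends Exception {
        InvalidOffsetException(String message) {
            super(message);
        }
    }

    // A span is valid when 0 <= start <= end <= size.
    static boolean isValidSpan(long size, long start, long end) {
        return start >= 0 && end <= size && start <= end;
    }

    // Hypothetical string-backed equivalent of getContent(start, end).
    static String getContent(String text, long start, long end)
            throws InvalidOffsetException {
        if (!isValidSpan(text.length(), start, end)) {
            throw new InvalidOffsetException(
                "Invalid span [" + start + "," + end + "] for size " + text.length());
        }
        return text.substring((int) start, (int) end);
    }

    public static void main(String[] args) {
        try {
            System.out.println(getContent("ab cd", 3, 5)); // cd
            getContent("ab cd", 3, 6);                     // end exceeds size
        } catch (InvalidOffsetException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these rules an empty span such as [2,2] is accepted, since start is not larger than end, and the largest valid end offset equals the content size.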
Long size()
Copyright © 2024 GATE. All rights reserved.