Interface AnnotationIndex<T extends AnnotationFS>

Type Parameters:
T - The topmost Java cover class (usually a JCas class) specified for the underlying index.
All Superinterfaces:
FSIndex<T>, Iterable<T>
All Known Implementing Classes:
AnnotationIndexImpl

public interface AnnotationIndex<T extends AnnotationFS> extends FSIndex<T>
An annotation index provides additional iterator functionality that applies only to instances of uima.tcas.Annotation (or its subtypes). You can obtain an AnnotationIndex by calling:

AnnotationIndex idx = cas.getAnnotationIndex(); or
AnnotationIndex<SomeJCasType> idx = jcas.getAnnotationIndex(SomeJCasType.class);

Note that the AnnotationIndex defines the following sort order between two annotations:

  • Annotations are sorted in increasing order of their start offset. That is, for any annotations a and b, if a.start < b.start then a < b.
  • Annotations whose start offsets are equal are next sorted by decreasing order of their end offsets. That is, if a.start = b.start and a.end > b.end, then a < b. This causes annotations with larger spans to be sorted before annotations with smaller spans, which produces an iteration order similar to a preorder tree traversal.
  • Annotations whose start offsets are equal and whose end offsets are equal are sorted based on TypePriorities (which is an element of the component descriptor). That is, if a.start = b.start, a.end = b.end, and the type of a is defined before the type of b in the type priorities, then a < b.
  • If none of the above rules apply, then the ordering is arbitrary. This will occur if you have two annotations of the exact same type that also have the same span. It will also occur if you have not defined any type priority between two annotations that have the same span.

In the method descriptions below, the notation a < b, where a and b are annotations, should be taken to mean a comes before b in the index, according to the above rules.
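
The ordering rules above can be modelled with a plain comparator. The following is a minimal sketch, not UIMA's implementation: the Span class, the order method, and the representation of type priorities as a list of type names are all hypothetical, introduced only for illustration. Types missing from the list are ranked last here purely to keep the comparator consistent; in the real index that case is arbitrary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class AnnotationOrderSketch {

    /** Hypothetical stand-in for an annotation; not a UIMA class. */
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + "[" + begin + "," + end + "]";
        }
    }

    /** Builds the sort order; earlier entries in typePriorities sort first. */
    static Comparator<Span> order(final List<String> typePriorities) {
        return (a, b) -> {
            if (a.begin != b.begin) {
                return Integer.compare(a.begin, b.begin); // increasing start offset
            }
            if (a.end != b.end) {
                return Integer.compare(b.end, a.end);     // decreasing end offset
            }
            // Same span: fall back to type priority; 0 means "arbitrary".
            return Integer.compare(rank(typePriorities, a.type),
                                   rank(typePriorities, b.type));
        };
    }

    /** Assumption of this sketch: unlisted types rank after all listed ones. */
    private static int rank(List<String> typePriorities, String type) {
        int i = typePriorities.indexOf(type);
        return i < 0 ? Integer.MAX_VALUE : i;
    }

    public static void main(String[] args) {
        List<Span> spans = new ArrayList<>(Arrays.asList(
            new Span("Token", 6, 10),
            new Span("Sentence", 0, 20),
            new Span("Token", 0, 5),
            new Span("Paragraph", 0, 20)));
        spans.sort(order(Arrays.asList("Paragraph", "Sentence", "Token")));
        // prints [Paragraph[0,20], Sentence[0,20], Token[0,5], Token[6,10]]
        System.out.println(spans);
    }
}
```

The larger spans come first for a given start offset, which is what gives iteration its preorder-like shape.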

  • Method Details

    • iterator

      FSIterator<T> iterator(boolean ambiguous)
      Return an iterator over annotations that can be constrained to be unambiguous.

      An unambiguous (disambiguated) iterator is defined as follows. The first annotation returned is the same as would be returned by the corresponding ambiguous iterator. If the unambiguous iterator has previously returned an annotation a, it will next return the smallest b such that a < b and a.getEnd() <= b.getBegin(). In other words, b begins at or after the end of a, so the two spans do not overlap.

      An unambiguous iterator makes a snapshot copy of the index containing just the disambiguated items, and iterates over that. It doesn't check for concurrent index modifications (the ambiguous iterator does check for this).
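
      The selection rule can be sketched over a plain list that is already in index order. This is an illustrative model only, not the UIMA implementation; the Span class and the disambiguate method are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class UnambiguousIteratorSketch {

    /** Hypothetical stand-in for an annotation; not a UIMA class. */
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + "[" + begin + "," + end + "]";
        }
    }

    /**
     * Builds the snapshot an unambiguous iterator would walk.
     * Assumes the input list is already sorted in index order.
     */
    static List<Span> disambiguate(List<Span> sorted) {
        List<Span> snapshot = new ArrayList<>();
        Span last = null;
        for (Span b : sorted) {
            // Keep the first item, then only items that start at or after
            // the end of the previously kept item.
            if (last == null || last.end <= b.begin) {
                snapshot.add(b);
                last = b;
            }
        }
        return snapshot;
    }

    public static void main(String[] args) {
        List<Span> sorted = Arrays.asList(
            new Span("Sentence", 0, 20),
            new Span("Token", 0, 5),
            new Span("Token", 6, 10),
            new Span("Sentence", 20, 30),
            new Span("Token", 20, 25));
        // prints [Sentence[0,20], Sentence[20,30]]
        System.out.println(disambiguate(sorted));
    }
}
```

      Because the first Sentence covers offsets 0 to 20, both Tokens inside it are skipped; the second Sentence is kept because it begins exactly where the first one ends.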

      Parameters:
      ambiguous - If set to false, iterator will be unambiguous.
      Returns:
      An annotation iterator.
    • subiterator

      FSIterator<T> subiterator(AnnotationFS annot)
      Return a subiterator whose bounds are defined by the input annotation.

      The subiterator will return annotations b s.t. annot < b, annot.getBegin() <= b.getBegin() and annot.getEnd() >= b.getEnd(). For annotations x, y, x < y here is to be interpreted as "x comes before y in the index", according to the rules defined in the description of this class.

      This definition implies that annotations b that have the same span as annot may or may not be returned by the subiterator. This is determined by the type priorities; the subiterator will only return such an annotation b if the type of annot precedes the type of b in the type priorities definition. If you have not specified the priority, or if annot and b are of the same type, then the behavior is undefined.

      For example, if you have an annotation s of type Sentence and an annotation p of type Paragraph that have the same span, and you have defined Paragraph before Sentence in your type priorities, then subiterator(p) will give you an iterator that will return s, but subiterator(s) will give you an iterator that will NOT return p. The intuition is that a Paragraph is conceptually larger than a Sentence, as defined by the type priorities.

      Calling subiterator(a) is equivalent to calling subiterator(a, true, true). See subiterator(AnnotationFS, boolean, boolean).
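
      The Paragraph and Sentence case can be reproduced with a small model of the strict, ambiguous subiterator. This is a sketch under stated assumptions, not UIMA code: Span, compare, and subiterator are hypothetical helpers, the index is a plain list already in index order, and every type is assumed to appear in the priorities list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class StrictSubiteratorSketch {

    /** Hypothetical stand-in for an annotation; not a UIMA class. */
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + "[" + begin + "," + end + "]";
        }
    }

    /** Index order; assumes both types are listed in priorities. */
    static int compare(Span a, Span b, List<String> priorities) {
        if (a.begin != b.begin) {
            return Integer.compare(a.begin, b.begin);
        }
        if (a.end != b.end) {
            return Integer.compare(b.end, a.end);
        }
        return Integer.compare(priorities.indexOf(a.type), priorities.indexOf(b.type));
    }

    /** Returns every b with annot < b that lies within the span of annot. */
    static List<Span> subiterator(List<Span> index, Span annot, List<String> priorities) {
        List<Span> result = new ArrayList<>();
        for (Span b : index) {
            boolean after = compare(annot, b, priorities) < 0;
            boolean covered = annot.begin <= b.begin && annot.end >= b.end;
            if (after && covered) {
                result.add(b);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> priorities = Arrays.asList("Paragraph", "Sentence", "Token");
        Span p = new Span("Paragraph", 0, 20);
        Span s = new Span("Sentence", 0, 20);
        Span t = new Span("Token", 0, 5);
        List<Span> index = Arrays.asList(p, s, t);

        System.out.println(subiterator(index, p, priorities)); // [Sentence[0,20], Token[0,5]]
        System.out.println(subiterator(index, s, priorities)); // [Token[0,5]]
    }
}
```

      The bounding annotation itself is never returned, because it does not come strictly before itself in the index.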

      Parameters:
      annot - Defines the boundaries of the subiterator.
      Returns:
      A subiterator.
    • subiterator

      FSIterator<T> subiterator(AnnotationFS annot, boolean ambiguous, boolean strict)
      Return a subiterator whose bounds are defined by the input annotation.

      A strict subiterator is defined as follows: it will return annotations b s.t. annot < b, annot.getBegin() <= b.getBegin() and annot.getEnd() >= b.getEnd(). For annotations x,y, x < y here is to be interpreted as "x comes before y in the index", according to the rules defined in the description of this class.

      If strict is set to false, the boundary conditions are relaxed as follows: return annotations b s.t. annot < b and annot.getBegin() <= b.getBegin() <= annot.getEnd(). In either mode, the resulting iterator is additionally disambiguated if ambiguous is set to false.

      These definitions imply that annotations b that have the same span as annot may or may not be returned by the subiterator. This is determined by the type priorities; the subiterator will only return such an annotation b if the type of annot precedes the type of b in the type priorities definition. If you have not specified the priority, or if annot and b are of the same type, then the behavior is undefined.

      For example, if you have an annotation s of type Sentence and an annotation p of type Paragraph that have the same span, and you have defined Paragraph before Sentence in your type priorities, then subiterator(p) will give you an iterator that will return s, but subiterator(s) will give you an iterator that will NOT return p. The intuition is that a Paragraph is conceptually larger than a Sentence, as defined by the type priorities.
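
      The difference between the strict and non-strict boundary conditions can be written as a single predicate. This is a minimal sketch, not UIMA code: inBounds is a hypothetical helper that works on raw offsets and covers only the boundary test; the annot < b condition is assumed to be checked separately.

```java
final class SubiteratorBoundsSketch {

    /**
     * Boundary test for a candidate b against the bounding annotation annot.
     * Assumes the caller has already established that annot < b in the index.
     */
    static boolean inBounds(int annotBegin, int annotEnd, int bBegin, int bEnd, boolean strict) {
        if (strict) {
            // b must lie entirely within annot.
            return annotBegin <= bBegin && annotEnd >= bEnd;
        }
        // b only has to begin within annot; it may extend past the end.
        return annotBegin <= bBegin && bBegin <= annotEnd;
    }

    public static void main(String[] args) {
        // Bounding annotation covers offsets 0 to 20.
        System.out.println(inBounds(0, 20, 5, 10, true));   // true: fully covered
        System.out.println(inBounds(0, 20, 15, 25, true));  // false: overlaps to the right
        System.out.println(inBounds(0, 20, 15, 25, false)); // true: begins inside annot
        System.out.println(inBounds(0, 20, 21, 30, false)); // false: begins after annot ends
    }
}
```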

      Parameters:
      annot - Annotation setting boundary conditions for subiterator.
      ambiguous - If set to false, resulting iterator will be unambiguous.
      strict - If set to true, annotations that extend beyond the end of annot are excluded; if set to false, annotations that begin within annot are returned even if they end after it.
      Returns:
      A subiterator.
    • tree

      AnnotationTree<T> tree(T annot)
      Create an annotation tree with annot as root node. The tree is defined as follows: for each node in the tree, the children are the sequence of annotations that would be obtained from a strict, unambiguous subiterator of the node's annotation.
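
      The recursive definition can be sketched as follows. This is an illustrative model rather than the UIMA implementation: Node and attachChildren are hypothetical names, the input list is assumed to be in index order, and no two annotations are assumed to share the same span, so type priorities never come into play.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AnnotationTreeSketch {

    /** Hypothetical tree node holding a span and its children; not a UIMA class. */
    static final class Node {
        final String type;
        final int begin;
        final int end;
        final List<Node> children = new ArrayList<>();

        Node(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            String self = type + "(" + begin + "-" + end + ")";
            return children.isEmpty() ? self : self + children;
        }
    }

    /**
     * Fills in node.children with what a strict, unambiguous subiterator
     * over node would return, then recurses into each child.
     * Assumes sorted is in index order and contains no duplicate spans.
     */
    static void attachChildren(Node node, List<Node> sorted) {
        Node last = null;
        for (Node b : sorted) {
            // node < b in index order (no identical spans by assumption).
            boolean after = b.begin > node.begin
                    || (b.begin == node.begin && b.end < node.end);
            // Strict: b lies entirely within node.
            boolean covered = node.begin <= b.begin && b.end <= node.end;
            // Unambiguous: b does not overlap the previously chosen child.
            boolean free = last == null || last.end <= b.begin;
            if (after && covered && free) {
                node.children.add(b);
                last = b;
            }
        }
        for (Node child : node.children) {
            attachChildren(child, sorted);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("Paragraph", 0, 30);
        List<Node> sorted = Arrays.asList(
            root,
            new Node("Sentence", 0, 20),
            new Node("Token", 0, 5),
            new Node("Token", 6, 10),
            new Node("Sentence", 21, 30),
            new Node("Token", 21, 25));
        attachChildren(root, sorted);
        System.out.println(root);
    }
}
```

      In this example the Paragraph ends up with the two Sentences as children, and each Sentence with the Tokens it covers.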
      Parameters:
      annot - The annotation at the root of the tree. This must be of type T or a subtype.
      Returns:
      The annotation tree rooted at annot.