edu.stanford.nlp.pipeline
Interface Annotator

All Known Implementing Classes:
AnnotationPipeline, CharniakParserAnnotator, ChineseSegmenterAnnotator, CleanXmlAnnotator, DeterministicCorefAnnotator, GenderAnnotator, GUTimeAnnotator, HeidelTimeAnnotator, MorphaAnnotator, NERCombinerAnnotator, ParserAnnotator, POSTaggerAnnotator, PTBTokenizerAnnotator, RegexNERAnnotator, StanfordCoreNLP, TimeAnnotator, TokenizerAnnotator, TokensRegexAnnotator, TrueCaseAnnotator, WhitespaceTokenizerAnnotator, WordsToSentencesAnnotator

public interface Annotator

This is an interface for adding annotations to a partially annotated Annotation. In some ways, it is just a glorified Function, except that it explicitly operates on Annotation objects. Annotators should be given to an AnnotationPipeline in order to make annotation pipelines (the whole motivation of this package). Implementers of this interface should therefore be designed to play well with other Annotators, and their javadocs should explicitly state which annotations they assume already exist in the Annotation (parse, POS tag, etc.), which keys they expect to find them under (Annotation.WORDS_KEY, Annotation.PARSE_KEY, etc.), and which annotations they will add or modify, along with the keys for those. If you would like to look at the code for a relatively simple Annotator, I recommend NERAnnotator. For a lot of existing code you could just add the implements clause directly, but I recommend wrapping it instead, because I believe that will help keep the pipeline code more manageable.
An Annotator can also describe what it produces, and what it requires to have been produced already, by using Requirement objects. Predefined Requirement objects are provided for most of the core annotators, such as tokenize, ssplit, etc. The StanfordCoreNLP version of the AnnotationPipeline can enforce requirements, throwing an exception if an annotator does not have all of its prerequisites met. An Annotator which does not participate in this system can simply return Collections.emptySet() for both requires() and requirementsSatisfied().
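
As a rough illustration of the contract described above, the sketch below implements an annotator that opts out of the requirement system. To stay runnable without the CoreNLP jar, it declares simplified stand-ins named Annotation, Requirement and Annotator inside one class. These stand-ins, the "text" and "charCount" keys, and CharCountAnnotator are all hypothetical and are not the real library types; only the three method names and the use of Collections.emptySet() come from this page.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class AnnotatorSketch {

    // Simplified stand-in for Annotator.Requirement (hypothetical, not the real class).
    static final class Requirement {
        final String name;
        Requirement(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    // Simplified stand-in for Annotation: a bag of values stored under string keys.
    static final class Annotation {
        private final Map<String, Object> values = new HashMap<>();
        Annotation(String text) { values.put("text", text); }
        Object get(String key) { return values.get(key); }
        void set(String key, Object value) { values.put(key, value); }
    }

    // Mirrors the three methods listed in the Method Summary.
    interface Annotator {
        void annotate(Annotation annotation);
        Set<Requirement> requirementsSatisfied();
        Set<Requirement> requires();
    }

    /**
     * Hypothetical annotator.
     * Assumes: the raw text is stored under "text".
     * Adds: the character count under "charCount".
     */
    static final class CharCountAnnotator implements Annotator {
        @Override public void annotate(Annotation annotation) {
            String text = (String) annotation.get("text");
            annotation.set("charCount", text.length());
        }
        // Not participating in the requirement system: both sets are empty.
        @Override public Set<Requirement> requirementsSatisfied() { return Collections.emptySet(); }
        @Override public Set<Requirement> requires() { return Collections.emptySet(); }
    }

    public static void main(String[] args) {
        Annotation doc = new Annotation("Stanford is in California.");
        new CharCountAnnotator().annotate(doc);
        System.out.println(doc.get("charCount"));
    }
}
```

Note that the class comment on CharCountAnnotator states what the annotator assumes and what it adds, which is the documentation habit the description above asks of implementers.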

Author:
Jenny Finkel

Nested Class Summary
static class Annotator.Requirement
          The Requirement is a general way of describing the preconditions and postconditions of running an Annotator.
 
Field Summary
static Annotator.Requirement CLEAN_XML_REQUIREMENT
           
static Annotator.Requirement DETERMINISTIC_COREF_REQUIREMENT
           
static Annotator.Requirement GENDER_REQUIREMENT
           
static Annotator.Requirement GUTIME_REQUIREMENT
          These are annotators which StanfordCoreNLP does not know how to create by itself, meaning you would need to use the custom annotator mechanism to create them.
static Annotator.Requirement HEIDELTIME_REQUIREMENT
           
static Annotator.Requirement LEMMA_REQUIREMENT
           
static Annotator.Requirement NER_REQUIREMENT
           
static Annotator.Requirement NFL_REQUIREMENT
           
static Annotator.Requirement NFL_TOKENIZE_REQUIREMENT
           
static Annotator.Requirement NUMBER_REQUIREMENT
           
static java.util.Set<Annotator.Requirement> PARSE_AND_TAG
           
static Annotator.Requirement PARSE_REQUIREMENT
           
static Annotator.Requirement POS_REQUIREMENT
           
static Annotator.Requirement QUANTIFIABLE_ENTITY_NORMALIZATION_REQUIREMENT
           
static Annotator.Requirement SSPLIT_REQUIREMENT
           
static java.lang.String STANFORD_CLEAN_XML
           
static java.lang.String STANFORD_DETERMINISTIC_COREF
           
static java.lang.String STANFORD_GENDER
           
static java.lang.String STANFORD_LEMMA
           
static java.lang.String STANFORD_NER
           
static java.lang.String STANFORD_NFL
           
static java.lang.String STANFORD_NFL_TOKENIZE
           
static java.lang.String STANFORD_PARSE
           
static java.lang.String STANFORD_POS
           
static java.lang.String STANFORD_REGEXNER
           
static java.lang.String STANFORD_SSPLIT
           
static java.lang.String STANFORD_TOKENIZE
          These are annotators which StanfordCoreNLP knows how to create.
static java.lang.String STANFORD_TRUECASE
           
static Annotator.Requirement STEM_REQUIREMENT
           
static Annotator.Requirement SUTIME_REQUIREMENT
           
static Annotator.Requirement TIME_WORDS_REQUIREMENT
           
static java.util.Set<Annotator.Requirement> TOKENIZE_AND_SSPLIT
          These are typical combinations of annotators which may be used as requirements by other annotators.
static Annotator.Requirement TOKENIZE_REQUIREMENT
           
static java.util.Set<Annotator.Requirement> TOKENIZE_SSPLIT_NER
           
static java.util.Set<Annotator.Requirement> TOKENIZE_SSPLIT_PARSE
           
static java.util.Set<Annotator.Requirement> TOKENIZE_SSPLIT_PARSE_NER
           
static java.util.Set<Annotator.Requirement> TOKENIZE_SSPLIT_POS
           
static java.util.Set<Annotator.Requirement> TOKENIZE_SSPLIT_POS_LEMMA
           
static Annotator.Requirement TRUECASE_REQUIREMENT
           
 
Method Summary
 void annotate(Annotation annotation)
          Given an Annotation, perform a task on this Annotation.
 java.util.Set<Annotator.Requirement> requirementsSatisfied()
          Returns the set of requirements that this annotator can satisfy.
 java.util.Set<Annotator.Requirement> requires()
          Returns the set of tasks which this annotator requires in order to perform.
 

Field Detail

STANFORD_TOKENIZE

static final java.lang.String STANFORD_TOKENIZE
These are annotators which StanfordCoreNLP knows how to create. Add new annotators and/or annotators from other groups here!

See Also:
Constant Field Values

STANFORD_CLEAN_XML

static final java.lang.String STANFORD_CLEAN_XML
See Also:
Constant Field Values

STANFORD_SSPLIT

static final java.lang.String STANFORD_SSPLIT
See Also:
Constant Field Values

STANFORD_POS

static final java.lang.String STANFORD_POS
See Also:
Constant Field Values

STANFORD_LEMMA

static final java.lang.String STANFORD_LEMMA
See Also:
Constant Field Values

STANFORD_NER

static final java.lang.String STANFORD_NER
See Also:
Constant Field Values

STANFORD_REGEXNER

static final java.lang.String STANFORD_REGEXNER
See Also:
Constant Field Values

STANFORD_GENDER

static final java.lang.String STANFORD_GENDER
See Also:
Constant Field Values

STANFORD_NFL_TOKENIZE

static final java.lang.String STANFORD_NFL_TOKENIZE
See Also:
Constant Field Values

STANFORD_NFL

static final java.lang.String STANFORD_NFL
See Also:
Constant Field Values

STANFORD_TRUECASE

static final java.lang.String STANFORD_TRUECASE
See Also:
Constant Field Values

STANFORD_PARSE

static final java.lang.String STANFORD_PARSE
See Also:
Constant Field Values

STANFORD_DETERMINISTIC_COREF

static final java.lang.String STANFORD_DETERMINISTIC_COREF
See Also:
Constant Field Values

TOKENIZE_REQUIREMENT

static final Annotator.Requirement TOKENIZE_REQUIREMENT

CLEAN_XML_REQUIREMENT

static final Annotator.Requirement CLEAN_XML_REQUIREMENT

SSPLIT_REQUIREMENT

static final Annotator.Requirement SSPLIT_REQUIREMENT

POS_REQUIREMENT

static final Annotator.Requirement POS_REQUIREMENT

LEMMA_REQUIREMENT

static final Annotator.Requirement LEMMA_REQUIREMENT

NER_REQUIREMENT

static final Annotator.Requirement NER_REQUIREMENT

GENDER_REQUIREMENT

static final Annotator.Requirement GENDER_REQUIREMENT

NFL_TOKENIZE_REQUIREMENT

static final Annotator.Requirement NFL_TOKENIZE_REQUIREMENT

NFL_REQUIREMENT

static final Annotator.Requirement NFL_REQUIREMENT

TRUECASE_REQUIREMENT

static final Annotator.Requirement TRUECASE_REQUIREMENT

PARSE_REQUIREMENT

static final Annotator.Requirement PARSE_REQUIREMENT

DETERMINISTIC_COREF_REQUIREMENT

static final Annotator.Requirement DETERMINISTIC_COREF_REQUIREMENT

GUTIME_REQUIREMENT

static final Annotator.Requirement GUTIME_REQUIREMENT
These are annotators which StanfordCoreNLP does not know how to create by itself, meaning you would need to use the custom annotator mechanism to create them. Note that some of them are already included in other parts of the system, such as sutime, which is already included in ner.


SUTIME_REQUIREMENT

static final Annotator.Requirement SUTIME_REQUIREMENT

HEIDELTIME_REQUIREMENT

static final Annotator.Requirement HEIDELTIME_REQUIREMENT

STEM_REQUIREMENT

static final Annotator.Requirement STEM_REQUIREMENT

NUMBER_REQUIREMENT

static final Annotator.Requirement NUMBER_REQUIREMENT

TIME_WORDS_REQUIREMENT

static final Annotator.Requirement TIME_WORDS_REQUIREMENT

QUANTIFIABLE_ENTITY_NORMALIZATION_REQUIREMENT

static final Annotator.Requirement QUANTIFIABLE_ENTITY_NORMALIZATION_REQUIREMENT

TOKENIZE_AND_SSPLIT

static final java.util.Set<Annotator.Requirement> TOKENIZE_AND_SSPLIT
These are typical combinations of annotators which may be used as requirements by other annotators.
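
A minimal sketch of how such a combination might be defined once and reused as the return value of requires(). Plain strings stand in for Annotator.Requirement objects, the helper unmodifiable(...) is hypothetical, and treating the shared sets as read-only is an assumption rather than something this page states.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

public class RequirementSetsSketch {

    // Strings stand in for Annotator.Requirement objects (an assumption for brevity).
    static final String TOKENIZE = "tokenize";
    static final String SSPLIT = "ssplit";
    static final String POS = "pos";

    // Hypothetical helper: builds a read-only set that keeps insertion order.
    static Set<String> unmodifiable(String... names) {
        return Collections.unmodifiableSet(new LinkedHashSet<>(Arrays.asList(names)));
    }

    // Combinations analogous in spirit to TOKENIZE_AND_SSPLIT and TOKENIZE_SSPLIT_POS.
    static final Set<String> TOKENIZE_AND_SSPLIT = unmodifiable(TOKENIZE, SSPLIT);
    static final Set<String> TOKENIZE_SSPLIT_POS = unmodifiable(TOKENIZE, SSPLIT, POS);

    // A hypothetical annotator that needs POS tags can return the shared constant directly.
    static Set<String> requires() {
        return TOKENIZE_SSPLIT_POS;
    }

    public static void main(String[] args) {
        System.out.println(requires().containsAll(TOKENIZE_AND_SSPLIT));
        try {
            requires().add("ner"); // shared constants should not be mutated by callers
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only");
        }
    }
}
```

Sharing one read-only set avoids rebuilding the same combination in every annotator and keeps one caller from changing what another sees.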


TOKENIZE_SSPLIT_POS

static final java.util.Set<Annotator.Requirement> TOKENIZE_SSPLIT_POS

TOKENIZE_SSPLIT_NER

static final java.util.Set<Annotator.Requirement> TOKENIZE_SSPLIT_NER

TOKENIZE_SSPLIT_PARSE

static final java.util.Set<Annotator.Requirement> TOKENIZE_SSPLIT_PARSE

TOKENIZE_SSPLIT_PARSE_NER

static final java.util.Set<Annotator.Requirement> TOKENIZE_SSPLIT_PARSE_NER

TOKENIZE_SSPLIT_POS_LEMMA

static final java.util.Set<Annotator.Requirement> TOKENIZE_SSPLIT_POS_LEMMA

PARSE_AND_TAG

static final java.util.Set<Annotator.Requirement> PARSE_AND_TAG

Method Detail

annotate

void annotate(Annotation annotation)
Given an Annotation, perform a task on this Annotation.


requirementsSatisfied

java.util.Set<Annotator.Requirement> requirementsSatisfied()
Returns the set of requirements that this annotator can satisfy. For example, the POS annotator will return "pos".


requires

java.util.Set<Annotator.Requirement> requires()
Returns the set of tasks which this annotator requires in order to perform. For example, the POS annotator will return "tokenize", "ssplit".
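
The sketch below shows one way a pipeline could use requires() and requirementsSatisfied() together to check prerequisites before running. It is not the StanfordCoreNLP implementation: Step, setOf and validate are hypothetical names, requirements are represented as plain strings, the choice of IllegalArgumentException is an assumption, and the ssplit entry requiring "tokenize" is illustrative. Only the POS example ("tokenize" and "ssplit" required, "pos" provided) is taken from the method descriptions above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class RequirementCheckSketch {

    // Stand-in for one annotator's declared contract.
    static final class Step {
        final String name;
        final Set<String> requires;   // what requires() would return
        final Set<String> satisfies;  // what requirementsSatisfied() would return
        Step(String name, Set<String> requires, Set<String> satisfies) {
            this.name = name;
            this.requires = requires;
            this.satisfies = satisfies;
        }
    }

    static Set<String> setOf(String... names) {
        return Collections.unmodifiableSet(new LinkedHashSet<>(Arrays.asList(names)));
    }

    // Walks the steps in order and fails on the first step whose
    // requirements were not satisfied by the steps that ran before it.
    static void validate(List<Step> steps) {
        Set<String> satisfied = new HashSet<>();
        for (Step step : steps) {
            if (!satisfied.containsAll(step.requires)) {
                Set<String> missing = new LinkedHashSet<>(step.requires);
                missing.removeAll(satisfied);
                throw new IllegalArgumentException(
                    "Annotator \"" + step.name + "\" is missing requirements: " + missing);
            }
            satisfied.addAll(step.satisfies);
        }
    }

    public static void main(String[] args) {
        Step tokenize = new Step("tokenize", setOf(), setOf("tokenize"));
        Step ssplit = new Step("ssplit", setOf("tokenize"), setOf("ssplit"));
        Step pos = new Step("pos", setOf("tokenize", "ssplit"), setOf("pos"));

        validate(Arrays.asList(tokenize, ssplit, pos)); // complete pipeline, no exception

        try {
            validate(Arrays.asList(tokenize, pos)); // ssplit was left out
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the check only reads the declared sets, it can run when the pipeline is assembled, before any text is processed.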



Stanford NLP Group