edu.stanford.nlp.pipeline
Class AnnotationPipeline

java.lang.Object
  extended by edu.stanford.nlp.pipeline.AnnotationPipeline
All Implemented Interfaces:
Annotator
Direct Known Subclasses:
StanfordCoreNLP

public class AnnotationPipeline
extends java.lang.Object
implements Annotator

This class is designed to apply multiple Annotators to an Annotation. The idea is that you first build up the pipeline by adding Annotators, and then you take the objects you wish to annotate, pass them in, and get back a fully annotated object. Please see the package-level Javadoc for sample usage and a more complete description.
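
The build-then-annotate pattern described above can be sketched in a few lines. This is a minimal, self-contained illustration and not the CoreNLP implementation: `Doc`, `Step`, and `Pipeline` are hypothetical stand-ins for Annotation, Annotator, and AnnotationPipeline, and the two steps (whitespace splitting and token counting) are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for illustration only; these are NOT the CoreNLP classes.
class PipelineSketch {

    /** Stand-in for Annotation: a mutable bag of named values. */
    static final class Doc {
        final Map<String, Object> values = new LinkedHashMap<>();

        Doc(String text) {
            values.put("text", text);
        }
    }

    /** Stand-in for Annotator: one processing step that mutates the document. */
    interface Step {
        void annotate(Doc doc);
    }

    /** Stand-in for AnnotationPipeline: itself a Step that runs its steps in insertion order. */
    static final class Pipeline implements Step {
        private final List<Step> steps = new ArrayList<>();

        void addStep(Step step) {
            steps.add(step);
        }

        @Override
        public void annotate(Doc doc) {
            for (Step step : steps) {
                step.annotate(doc); // each step sees what earlier steps added
            }
        }
    }

    /** Builds a two-step pipeline and runs it on the given text. */
    static Doc run(String text) {
        Pipeline pipeline = new Pipeline();
        // Hypothetical step 1: split the raw text on whitespace.
        pipeline.addStep(doc -> doc.values.put(
                "tokens", Arrays.asList(((String) doc.values.get("text")).split("\\s+"))));
        // Hypothetical step 2: count the tokens produced by step 1.
        pipeline.addStep(doc -> doc.values.put(
                "tokenCount", ((List<?>) doc.values.get("tokens")).size()));

        Doc doc = new Doc(text);
        pipeline.annotate(doc); // the document is modified in place
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(run("Stanford is in California").values);
    }
}
```

Because the pipeline stand-in implements the same interface as its steps, one pipeline can be added to another as a single step, which mirrors the fact that AnnotationPipeline implements Annotator.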

Author:
Jenny Finkel

Nested Class Summary
 
Nested classes/interfaces inherited from interface edu.stanford.nlp.pipeline.Annotator
Annotator.Requirement
 
Field Summary
protected static boolean TIME
           
 
Fields inherited from interface edu.stanford.nlp.pipeline.Annotator
CLEAN_XML_REQUIREMENT, DETERMINISTIC_COREF_REQUIREMENT, GENDER_REQUIREMENT, GUTIME_REQUIREMENT, HEIDELTIME_REQUIREMENT, LEMMA_REQUIREMENT, NER_REQUIREMENT, NFL_REQUIREMENT, NFL_TOKENIZE_REQUIREMENT, NUMBER_REQUIREMENT, PARSE_AND_TAG, PARSE_REQUIREMENT, POS_REQUIREMENT, QUANTIFIABLE_ENTITY_NORMALIZATION_REQUIREMENT, SSPLIT_REQUIREMENT, STANFORD_CLEAN_XML, STANFORD_DETERMINISTIC_COREF, STANFORD_GENDER, STANFORD_LEMMA, STANFORD_NER, STANFORD_NFL, STANFORD_NFL_TOKENIZE, STANFORD_PARSE, STANFORD_POS, STANFORD_REGEXNER, STANFORD_SSPLIT, STANFORD_TOKENIZE, STANFORD_TRUECASE, STEM_REQUIREMENT, SUTIME_REQUIREMENT, TIME_WORDS_REQUIREMENT, TOKENIZE_AND_SSPLIT, TOKENIZE_REQUIREMENT, TOKENIZE_SSPLIT_NER, TOKENIZE_SSPLIT_PARSE, TOKENIZE_SSPLIT_PARSE_NER, TOKENIZE_SSPLIT_POS, TOKENIZE_SSPLIT_POS_LEMMA, TRUECASE_REQUIREMENT
 
Constructor Summary
AnnotationPipeline()
           
AnnotationPipeline(java.util.List<Annotator> annotators)
           
 
Method Summary
 void addAnnotator(Annotator annotator)
           
 void annotate(Annotation annotation)
          Run the pipeline on an input annotation.
 void annotate(java.lang.Iterable<Annotation> annotations)
          Annotate a collection of input annotations IN PARALLEL, making use of all available cores.
 void annotate(java.lang.Iterable<Annotation> annotations, Function<Annotation,java.lang.Object> callback)
          Annotate a collection of input annotations IN PARALLEL, making use of all available cores.
 void annotate(java.lang.Iterable<Annotation> annotations, int numThreads)
          Annotate a collection of input annotations IN PARALLEL, using the number of threads given by numThreads.
 void annotate(java.lang.Iterable<Annotation> annotations, int numThreads, Function<Annotation,java.lang.Object> callback)
          Annotate a collection of input annotations IN PARALLEL, using the number of threads given by numThreads.
protected  long getTotalTime()
          Return the total pipeline annotation time in milliseconds.
static void main(java.lang.String[] args)
           
 java.util.Set<Annotator.Requirement> requirementsSatisfied()
          Returns the set of requirements that this annotator can satisfy.
 java.util.Set<Annotator.Requirement> requires()
          Returns the set of requirements that must be satisfied before this annotator can run.
 java.lang.String timingInformation()
          Return a String that gives detailed human-readable information about how much time was spent by each annotator and by the entire annotation pipeline.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

TIME

protected static final boolean TIME
See Also:
Constant Field Values
Constructor Detail

AnnotationPipeline

public AnnotationPipeline(java.util.List<Annotator> annotators)

AnnotationPipeline

public AnnotationPipeline()
Method Detail

addAnnotator

public void addAnnotator(Annotator annotator)

annotate

public void annotate(Annotation annotation)
Run the pipeline on an input annotation. The annotation is modified in place.

Specified by:
annotate in interface Annotator
Parameters:
annotation - The input annotation, usually a raw document

annotate

public void annotate(java.lang.Iterable<Annotation> annotations)
Annotate a collection of input annotations IN PARALLEL, making use of all available cores.

Parameters:
annotations - The input annotations to process

annotate

public void annotate(java.lang.Iterable<Annotation> annotations,
                     Function<Annotation,java.lang.Object> callback)
Annotate a collection of input annotations IN PARALLEL, making use of all available cores.

Parameters:
annotations - The input annotations to process
callback - A function to be called when an annotation finishes. The return value of the callback is ignored.

annotate

public void annotate(java.lang.Iterable<Annotation> annotations,
                     int numThreads)
Annotate a collection of input annotations IN PARALLEL, using the number of threads given by numThreads.

Parameters:
annotations - The input annotations to process
numThreads - The number of threads to run on

annotate

public void annotate(java.lang.Iterable<Annotation> annotations,
                     int numThreads,
                     Function<Annotation,java.lang.Object> callback)
Annotate a collection of input annotations IN PARALLEL, using the number of threads given by numThreads.

Parameters:
annotations - The input annotations to process
numThreads - The number of threads to run on
callback - A function to be called when an annotation finishes. The return value of the callback is ignored.
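
The parallel overloads above share one shape: a collection of inputs, a thread count, and a callback whose return value is ignored. The following is a hedged sketch of that shape using only the JDK, not the CoreNLP implementation. `annotateAll`, the `step` parameter, and the use of `StringBuilder` as a mutable document are assumptions made for illustration, and the callback type here is `java.util.function.Function` rather than the `Function` type named in the signatures above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.function.Function;

// Illustrative sketch only; this is NOT how CoreNLP implements annotate(...).
class ParallelAnnotateSketch {

    /**
     * Runs step on every item using numThreads worker threads and invokes
     * callback after each item finishes. Blocks until all items are done.
     */
    static <T> void annotateAll(Iterable<T> items, int numThreads,
                                Consumer<T> step, Function<T, Object> callback) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (T item : items) {
                pending.add(pool.submit(() -> {
                    step.accept(item);    // annotate the item in place
                    callback.apply(item); // return value deliberately ignored
                }));
            }
            for (Future<?> future : pending) {
                future.get(); // wait for completion and surface worker failures
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while annotating", e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<StringBuilder> docs = Arrays.asList(
                new StringBuilder("a"), new StringBuilder("b"), new StringBuilder("c"));
        AtomicInteger finished = new AtomicInteger();
        annotateAll(docs, 2, doc -> doc.append(" [annotated]"), doc -> finished.incrementAndGet());
        System.out.println(finished.get() + " documents finished: " + docs);
    }
}
```

In this sketch the callback runs on the worker threads, so it should be thread-safe (hence the `AtomicInteger`). The documentation above does not say which thread invokes the real callback, so treat that as an open question when writing one.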

getTotalTime

protected long getTotalTime()
Return the total pipeline annotation time in milliseconds.

Returns:
The total pipeline annotation time in milliseconds

timingInformation

public java.lang.String timingInformation()
Return a String that gives detailed human-readable information about how much time was spent by each annotator and by the entire annotation pipeline. This String includes newline characters but does not end with one, and so it is suitable to be printed out with a println().

Returns:
Human-readable information on the time spent in processing.
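
As a rough illustration of the contract described above (per-step times plus a total, joined by newlines with no trailing newline), here is a small self-contained sketch. It is not the CoreNLP implementation: the class `TimingSketch`, its methods, and the report format are all assumptions, since the documentation does not specify the exact layout of the string.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; the real report format is not specified in this documentation.
class TimingSketch {
    private final Map<String, Runnable> steps = new LinkedHashMap<>();
    private final Map<String, Long> millisByStep = new LinkedHashMap<>();

    void addStep(String name, Runnable step) {
        steps.put(name, step);
        millisByStep.put(name, 0L);
    }

    /** Runs every step once, accumulating elapsed wall-clock time per step. */
    void run() {
        for (Map.Entry<String, Runnable> entry : steps.entrySet()) {
            long start = System.nanoTime();
            entry.getValue().run();
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            millisByStep.merge(entry.getKey(), elapsedMs, Long::sum);
        }
    }

    /** Total time across all steps, in milliseconds. */
    long totalTime() {
        long total = 0;
        for (long ms : millisByStep.values()) {
            total += ms;
        }
        return total;
    }

    /** One line per step plus a total; newlines between lines, none at the end. */
    String timingInformation() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> entry : millisByStep.entrySet()) {
            lines.add(entry.getKey() + ": " + entry.getValue() + " ms");
        }
        lines.add("TOTAL: " + totalTime() + " ms");
        return String.join("\n", lines);
    }

    public static void main(String[] args) {
        TimingSketch timing = new TimingSketch();
        timing.addStep("tokenize", () -> { });
        timing.addStep("pos", () -> { });
        timing.run();
        System.out.println(timing.timingInformation()); // println supplies the final newline
    }
}
```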

requirementsSatisfied

public java.util.Set<Annotator.Requirement> requirementsSatisfied()
Description copied from interface: Annotator
Returns the set of requirements that this annotator can satisfy. For example, the POS annotator will return "pos".

Specified by:
requirementsSatisfied in interface Annotator

requires

public java.util.Set<Annotator.Requirement> requires()
Description copied from interface: Annotator
Returns the set of requirements that must be satisfied before this annotator can run. For example, the POS annotator will return "tokenize", "ssplit".

Specified by:
requires in interface Annotator
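
To show how `requires()` and `requirementsSatisfied()` relate, the sketch below checks an ordered list of steps and reports the first requirement that no earlier step provides. This is an illustration under stated assumptions, not CoreNLP code: requirements are modelled as plain strings instead of Annotator.Requirement, `Step` and `firstMissing` are hypothetical names, and the documentation above does not say whether the pipeline itself performs such a check.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; requirements are plain strings here, not Annotator.Requirement.
class RequirementSketch {

    /** Hypothetical stand-in for an annotator's two requirement sets. */
    static final class Step {
        final String name;
        final Set<String> requires;
        final Set<String> satisfies;

        Step(String name, List<String> requires, List<String> satisfies) {
            this.name = name;
            this.requires = new LinkedHashSet<>(requires);   // keeps a stable order
            this.satisfies = new LinkedHashSet<>(satisfies);
        }
    }

    /** Returns a message naming the first unmet requirement, or null if the order is valid. */
    static String firstMissing(List<Step> ordered) {
        Set<String> available = new HashSet<>();
        for (Step step : ordered) {
            for (String requirement : step.requires) {
                if (!available.contains(requirement)) {
                    return step.name + " needs " + requirement;
                }
            }
            available.addAll(step.satisfies); // later steps may rely on these
        }
        return null;
    }

    public static void main(String[] args) {
        Step tokenize = new Step("tokenize", Collections.emptyList(), Arrays.asList("tokenize"));
        Step ssplit = new Step("ssplit", Arrays.asList("tokenize"), Arrays.asList("ssplit"));
        Step pos = new Step("pos", Arrays.asList("tokenize", "ssplit"), Arrays.asList("pos"));

        System.out.println(firstMissing(Arrays.asList(tokenize, ssplit, pos)));
        System.out.println(firstMissing(Arrays.asList(pos, tokenize, ssplit)));
    }
}
```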

main

public static void main(java.lang.String[] args)
                 throws java.io.IOException,
                        java.lang.ClassNotFoundException
Throws:
java.io.IOException
java.lang.ClassNotFoundException


Stanford NLP Group