edu.stanford.nlp.pipeline
Class TokenizerAnnotator

java.lang.Object
  extended by edu.stanford.nlp.pipeline.TokenizerAnnotator
All Implemented Interfaces:
Annotator
Direct Known Subclasses:
PTBTokenizerAnnotator, WhitespaceTokenizerAnnotator

public abstract class TokenizerAnnotator
extends java.lang.Object
implements Annotator

Abstract base class for annotators that use a Tokenizer to split the TextAnnotation of an Annotation into a TokensAnnotation.
A subclass only needs to define one method, getTokenizer, which produces a Tokenizer of CoreLabels; that tokenizer is then used to split the TextAnnotation of the given Annotation into CoreLabels.
To keep the annotator thread-safe, getTokenizer must return a thread-safe tokenizer. For tokenizers built from .flex files, that means returning a new tokenizer on each call.
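
The thread-safety advice above can be illustrated with a minimal, library-free sketch. It does not use the CoreNLP classes: StatefulTokenizer is a hypothetical stand-in for a flex-generated scanner that keeps mutable state, tokens are plain Strings rather than CoreLabels, and the static getTokenizer method only plays the role of a subclass's getTokenizer. Because each call builds a new tokenizer, concurrent callers never share that state.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FreshTokenizerSketch {

  /** Hypothetical stand-in for a stateful scanner; one instance must not be shared across threads. */
  static final class StatefulTokenizer {
    private final Reader reader;
    private final StringBuilder buffer = new StringBuilder(); // mutable per-instance state

    StatefulTokenizer(Reader reader) {
      this.reader = reader;
    }

    /** Splits the reader's content on whitespace. */
    List<String> tokenize() {
      List<String> tokens = new ArrayList<>();
      try {
        int c;
        while ((c = reader.read()) != -1) {
          if (Character.isWhitespace(c)) {
            flush(tokens);
          } else {
            buffer.append((char) c);
          }
        }
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
      flush(tokens);
      return tokens;
    }

    private void flush(List<String> tokens) {
      if (buffer.length() > 0) {
        tokens.add(buffer.toString());
        buffer.setLength(0);
      }
    }
  }

  /** Plays the role of getTokenizer: a new tokenizer on every call, so no state is shared. */
  static StatefulTokenizer getTokenizer(Reader r) {
    return new StatefulTokenizer(r);
  }

  /** Tokenizes each text on a worker thread and returns the results in input order. */
  static List<List<String>> tokenizeConcurrently(List<String> texts) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<Future<List<String>>> futures = new ArrayList<>();
      for (String text : texts) {
        futures.add(pool.submit(() -> getTokenizer(new StringReader(text)).tokenize()));
      }
      List<List<String>> results = new ArrayList<>();
      for (Future<List<String>> future : futures) {
        results.add(future.get());
      }
      return results;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```

Calling FreshTokenizerSketch.tokenizeConcurrently with the texts "a b" and "c  d e" yields the token lists [a, b] and [c, d, e]. Caching a single StatefulTokenizer in a field and reusing it from several threads would let the threads interleave writes to the same buffer, which is the situation the per-call construction avoids.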

Author:
Jenny Finkel, John Bauer

Constructor Summary
TokenizerAnnotator(boolean verbose)
           
 
Method Summary
 void annotate(Annotation annotation)
          Splits the TextAnnotation of the given Annotation into CoreLabels and stores them as its TokensAnnotation.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TokenizerAnnotator

public TokenizerAnnotator(boolean verbose)
Method Detail

annotate

public void annotate(Annotation annotation)
Splits the TextAnnotation of the given Annotation into CoreLabels and stores them as its TokensAnnotation.

Specified by:
annotate in interface Annotator
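
The data flow of annotate can be sketched without the library as follows. Everything here is a stand-in chosen for illustration: a Map with the hypothetical keys "text" and "tokens" replaces Annotation with its TextAnnotation and TokensAnnotation, SketchTokenizer replaces Tokenizer, and tokens are plain Strings rather than CoreLabels. The exception thrown when no text is present is an assumption, since this page does not describe that case.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

class AnnotateFlowSketch {
  static final String TEXT_KEY = "text";     // stands in for TextAnnotation
  static final String TOKENS_KEY = "tokens"; // stands in for TokensAnnotation

  /** Stand-in for the Tokenizer type. */
  interface SketchTokenizer {
    List<String> tokenize();
  }

  /** Stand-in for the abstract base class: annotate is fixed, getTokenizer is left to subclasses. */
  abstract static class SketchTokenizerAnnotator {
    abstract SketchTokenizer getTokenizer(Reader r);

    void annotate(Map<String, Object> annotation) {
      Object text = annotation.get(TEXT_KEY);
      if (!(text instanceof String)) {
        // Assumed behavior for a missing text; not taken from this page.
        throw new IllegalArgumentException("annotation has no text");
      }
      Reader r = new StringReader((String) text);
      // Tokenize the text and attach the result under the tokens key.
      annotation.put(TOKENS_KEY, getTokenizer(r).tokenize());
    }
  }

  /** Example subclass: defines only getTokenizer and returns a new tokenizer on each call. */
  static final class WhitespaceSketchAnnotator extends SketchTokenizerAnnotator {
    @Override
    SketchTokenizer getTokenizer(Reader r) {
      return () -> {
        List<String> tokens = new ArrayList<>();
        Scanner scanner = new Scanner(r); // default delimiter is whitespace
        while (scanner.hasNext()) {
          tokens.add(scanner.next());
        }
        return tokens;
      };
    }
  }

  /** Builds an annotation for the text, runs the annotator, and returns the attached tokens. */
  static List<String> run(String text) {
    Map<String, Object> annotation = new HashMap<>();
    annotation.put(TEXT_KEY, text);
    new WhitespaceSketchAnnotator().annotate(annotation);
    @SuppressWarnings("unchecked")
    List<String> tokens = (List<String>) annotation.get(TOKENS_KEY);
    return tokens;
  }
}
```

With these stand-ins, AnnotateFlowSketch.run("The quick  brown fox") returns the four tokens The, quick, brown and fox. The caller only supplies text and reads tokens back, while the choice of tokenizer stays inside the subclass.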


Stanford NLP Group