edu.stanford.nlp.pipeline
Class WordsToSentencesAnnotator

java.lang.Object
  extended by edu.stanford.nlp.pipeline.WordsToSentencesAnnotator
All Implemented Interfaces:
Annotator

public class WordsToSentencesAnnotator
extends java.lang.Object
implements Annotator

This class assumes that there is a List<? extends CoreLabel> under the TokensAnnotation field. It runs that list through WordToSentenceProcessor and puts the resulting List<List<? extends CoreLabel>> back under the Annotation.WORDS_KEY field.
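
The grouping step described above can be pictured with a small standalone sketch. It does not use CoreNLP and is not the WordToSentenceProcessor implementation: the class name SentenceGroupingSketch, the method group, and the use of plain String tokens in place of CoreLabel are all assumptions made for illustration. It only shows the shape of the transformation, a flat token list becoming a list of token lists, with a sentence closed after each token that matches a boundary regex.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the CoreNLP implementation.
final class SentenceGroupingSketch {

    // Splits a flat token list into sentences, closing a sentence after
    // every token that fully matches boundaryTokenRegex.
    static List<List<String>> group(List<String> tokens, String boundaryTokenRegex) {
        Pattern boundary = Pattern.compile(boundaryTokenRegex);
        List<List<String>> sentences = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String token : tokens) {
            current.add(token);
            if (boundary.matcher(token).matches()) {
                sentences.add(current);
                current = new ArrayList<>();
            }
        }
        // Trailing tokens without a closing boundary still form a sentence.
        if (!current.isEmpty()) {
            sentences.add(current);
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<String> tokens =
            Arrays.asList("Hello", "world", ".", "How", "are", "you", "?", "Fine");
        // The regex here is an assumed example, not the library's default.
        System.out.println(group(tokens, "\\.|[!?]+"));
    }
}
```

In this sketch the boundary token stays inside the sentence it closes; whether a real splitter keeps or discards such tokens depends on its configuration.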

Author:
Jenny Finkel

Nested Class Summary
 
Nested classes/interfaces inherited from interface edu.stanford.nlp.pipeline.Annotator
Annotator.Requirement
 
Field Summary
 
Fields inherited from interface edu.stanford.nlp.pipeline.Annotator
CLEAN_XML_REQUIREMENT, DETERMINISTIC_COREF_REQUIREMENT, GENDER_REQUIREMENT, GUTIME_REQUIREMENT, HEIDELTIME_REQUIREMENT, LEMMA_REQUIREMENT, NER_REQUIREMENT, NFL_REQUIREMENT, NFL_TOKENIZE_REQUIREMENT, NUMBER_REQUIREMENT, PARSE_AND_TAG, PARSE_REQUIREMENT, POS_REQUIREMENT, QUANTIFIABLE_ENTITY_NORMALIZATION_REQUIREMENT, SSPLIT_REQUIREMENT, STANFORD_CLEAN_XML, STANFORD_DETERMINISTIC_COREF, STANFORD_GENDER, STANFORD_LEMMA, STANFORD_NER, STANFORD_NFL, STANFORD_NFL_TOKENIZE, STANFORD_PARSE, STANFORD_POS, STANFORD_REGEXNER, STANFORD_SSPLIT, STANFORD_TOKENIZE, STANFORD_TRUECASE, STEM_REQUIREMENT, SUTIME_REQUIREMENT, TIME_WORDS_REQUIREMENT, TOKENIZE_AND_SSPLIT, TOKENIZE_REQUIREMENT, TOKENIZE_SSPLIT_NER, TOKENIZE_SSPLIT_PARSE, TOKENIZE_SSPLIT_PARSE_NER, TOKENIZE_SSPLIT_POS, TOKENIZE_SSPLIT_POS_LEMMA, TRUECASE_REQUIREMENT
 
Constructor Summary
WordsToSentencesAnnotator()
           
WordsToSentencesAnnotator(boolean verbose)
           
WordsToSentencesAnnotator(boolean verbose, java.lang.String boundaryTokenRegex)
           
 
Method Summary
 void addHtmlSentenceBoundaryToDiscard(java.util.Set<java.lang.String> boundaries)
           
 void annotate(Annotation annotation)
          Given an Annotation, perform a task on this Annotation.
static WordsToSentencesAnnotator newlineSplitter(boolean verbose, java.lang.String... nlToken)
           
 java.util.Set<Annotator.Requirement> requirementsSatisfied()
          Returns the set of requirements that this annotator can satisfy.
 java.util.Set<Annotator.Requirement> requires()
          Returns the set of tasks which this annotator requires in order to perform.
 void setCountLineNumbers(boolean countLineNumbers)
          If setCountLineNumbers is set to true, we count line numbers by telling the underlying splitter to return empty lists of tokens and then treating those empty lists as empty lines.
 void setOneSentence(boolean isOneSentence)
           
 void setSentenceBoundaryToDiscard(java.util.Set<java.lang.String> boundaries)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

WordsToSentencesAnnotator

public WordsToSentencesAnnotator()

WordsToSentencesAnnotator

public WordsToSentencesAnnotator(boolean verbose)

WordsToSentencesAnnotator

public WordsToSentencesAnnotator(boolean verbose,
                                 java.lang.String boundaryTokenRegex)
Method Detail

newlineSplitter

public static WordsToSentencesAnnotator newlineSplitter(boolean verbose,
                                                        java.lang.String... nlToken)

setSentenceBoundaryToDiscard

public void setSentenceBoundaryToDiscard(java.util.Set<java.lang.String> boundaries)

addHtmlSentenceBoundaryToDiscard

public void addHtmlSentenceBoundaryToDiscard(java.util.Set<java.lang.String> boundaries)

setOneSentence

public void setOneSentence(boolean isOneSentence)

setCountLineNumbers

public void setCountLineNumbers(boolean countLineNumbers)
If setCountLineNumbers is set to true, we count line numbers by telling the underlying splitter to return empty lists of tokens and then treating those empty lists as empty lines. We don't actually include empty sentences in the annotation, though.
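
The counting idea can be sketched without CoreNLP as follows. The names LineNumberSketch, NumberedSentence, and number are hypothetical, tokens are plain String values, and the sketch assumes that each list returned by the splitter corresponds to exactly one input line. Empty lists advance the line counter but are not kept in the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not the CoreNLP implementation.
final class LineNumberSketch {

    // A kept sentence together with the 1-based line it came from.
    static final class NumberedSentence {
        final int lineNumber;
        final List<String> tokens;

        NumberedSentence(int lineNumber, List<String> tokens) {
            this.lineNumber = lineNumber;
            this.tokens = tokens;
        }
    }

    // Assumes one list per input line; empty lists stand for empty lines.
    static List<NumberedSentence> number(List<List<String>> splitterOutput) {
        List<NumberedSentence> kept = new ArrayList<>();
        int line = 0;
        for (List<String> sentence : splitterOutput) {
            line++; // every list, empty or not, advances the line counter
            if (!sentence.isEmpty()) {
                kept.add(new NumberedSentence(line, sentence));
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<List<String>> output = Arrays.asList(
            Arrays.asList("first", "line"),
            Collections.<String>emptyList(),
            Arrays.asList("third", "line"));
        for (NumberedSentence s : number(output)) {
            System.out.println(s.lineNumber + ": " + s.tokens);
        }
    }
}
```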


annotate

public void annotate(Annotation annotation)
Description copied from interface: Annotator
Given an Annotation, perform a task on this Annotation.

Specified by:
annotate in interface Annotator

requires

public java.util.Set<Annotator.Requirement> requires()
Description copied from interface: Annotator
Returns the set of tasks which this annotator requires in order to perform. For example, the POS annotator will return "tokenize", "ssplit".

Specified by:
requires in interface Annotator

requirementsSatisfied

public java.util.Set<Annotator.Requirement> requirementsSatisfied()
Description copied from interface: Annotator
Returns the set of requirements that this annotator can satisfy. For example, the POS annotator will return "pos".

Specified by:
requirementsSatisfied in interface Annotator
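
One way to picture how requires() and requirementsSatisfied() fit together is an ordering check over a pipeline. The sketch below is standalone and hypothetical: RequirementCheckSketch, Step, and firstUnsatisfied are not CoreNLP names, and requirements are modelled as plain strings rather than Annotator.Requirement. The "pos" entry follows the example given in the descriptions above; the entries for "tokenize" and "ssplit" are assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of CoreNLP.
final class RequirementCheckSketch {

    // A pipeline step with the tasks it needs and the tasks it provides.
    static final class Step {
        final String name;
        final Set<String> requires;
        final Set<String> satisfies;

        Step(String name, List<String> requires, List<String> satisfies) {
            this.name = name;
            this.requires = new HashSet<>(requires);
            this.satisfies = new HashSet<>(satisfies);
        }
    }

    // Returns the name of the first step whose requirements are not met
    // by the steps before it, or null if the order is valid.
    static String firstUnsatisfied(List<Step> pipeline) {
        Set<String> available = new HashSet<>();
        for (Step step : pipeline) {
            if (!available.containsAll(step.requires)) {
                return step.name;
            }
            available.addAll(step.satisfies);
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumed requirement sets for tokenize and ssplit.
        Step tokenize = new Step("tokenize",
            Arrays.<String>asList(), Arrays.asList("tokenize"));
        Step ssplit = new Step("ssplit",
            Arrays.asList("tokenize"), Arrays.asList("ssplit"));
        // Taken from the description: POS requires tokenize and ssplit.
        Step pos = new Step("pos",
            Arrays.asList("tokenize", "ssplit"), Arrays.asList("pos"));

        System.out.println(firstUnsatisfied(Arrays.asList(tokenize, ssplit, pos)));
        System.out.println(firstUnsatisfied(Arrays.asList(tokenize, pos, ssplit)));
    }
}
```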


Stanford NLP Group