| Interface Summary | |
|---|---|
| CoreTokenFactory<IN extends CoreMap> | A factory for making tokens such as CoreMap or CoreLabel. |
| LexedTokenFactory<T> | Constructs a token (of arbitrary type) from a String and its position in the underlying text. |
| ListProcessor<IN,OUT> | An interface for things that operate on a List. |
| SerializableFunction<T1,T2> | This interface is a conjunction of Function and Serializable, which is a bad idea from the perspective of the type system, but one that seems more palatable than other bad ideas until Java's type system is flexible enough to support type conjunctions. |
| Tokenizer<T> | Tokenizers break up text into individual Objects. |
| WordSegmenter | An interface for segmenting strings into words (for languages written without explicit word boundaries). |
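
The tokenizer and token-factory interfaces above can be illustrated with a minimal, self-contained sketch. It does not use this package: `TokenizerSketch`, `SketchTokenFactory`, and `Span` are hypothetical names invented for the example, and the whitespace-splitting rule is only an assumption about how a simple tokenizer might hand each piece of text and its position to a factory.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenizerSketch {

    /** Hypothetical stand-in for a token factory: builds a token from text plus its offsets. */
    public interface SketchTokenFactory<T> {
        T make(String text, int begin, int end);
    }

    /** Simple token type recording begin (inclusive) and end (exclusive) character offsets. */
    public static final class Span {
        public final String text;
        public final int begin;
        public final int end;

        public Span(String text, int begin, int end) {
            this.text = text;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return text + "[" + begin + "," + end + ")";
        }
    }

    /** Splits on whitespace, discards it, and passes each remaining run of characters to the factory. */
    public static <T> List<T> tokenize(String input, SketchTokenFactory<T> factory) {
        List<T> tokens = new ArrayList<>();
        int i = 0;
        int n = input.length();
        while (i < n) {
            // Skip whitespace between tokens.
            while (i < n && Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            int begin = i;
            // Consume the token itself.
            while (i < n && !Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            if (i > begin) {
                tokens.add(factory.make(input.substring(begin, i), begin, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The Span constructor matches the factory's single method, so it can be passed directly.
        for (Span s : tokenize("Tokenizers break  up text", Span::new)) {
            System.out.println(s);
        }
    }
}
```

Making the factory a type parameter keeps the splitting logic independent of the token type, which is the separation the interfaces above suggest.
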
| Class Summary | |
|---|---|
| AbstractTokenizer<T> | An abstract tokenizer. |
| Americanize | Takes a HasWord or String and returns an Americanized version of it. |
| ChineseDocumentToSentenceProcessor | Convert a Chinese Document into a List of sentence Strings. |
| CoreLabelTokenFactory | Constructs CoreLabels from Strings, optionally with beginning and ending (character after the end) offset positions in an original text. |
| Morphology | Morphology computes the base form of English words, by removing just inflections (not derivational morphology). |
| PTBTokenizer<T extends HasWord> | Fast, rule-based tokenizer implementation, initially written to conform to the Penn Treebank tokenization conventions, but now providing a range of tokenization options over a broader space of Unicode text. |
| PTBTokenizer.PTBTokenizerFactory<T extends HasWord> | This class provides a factory which will vend instances of PTBTokenizer which wrap a provided Reader. |
| TokenizerAdapter | This class adapts between a java.io.StreamTokenizer and an edu.stanford.nlp.process.Tokenizer. |
| WhitespaceTokenizer<T extends HasWord> | A WhitespaceTokenizer is a tokenizer that splits on and discards only whitespace characters. |
| WhitespaceTokenizer.WhitespaceTokenizerFactory<T extends HasWord> | A factory which vends WhitespaceTokenizers. |
| WordSegmentingTokenizer | A tokenizer that works by calling a WordSegmenter. |
| WordShapeClassifier | Provides static methods which map any String to another String indicative of its "word shape" -- e.g., whether capitalized, numeric, etc. |
| WordTokenFactory | Constructs a Word from a String. |
| WordToSentenceProcessor<IN> | Transforms a Document of Words into a Document of Sentences by grouping the Words. |
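
To make the "word shape" idea concrete, here is a small sketch of one possible mapping. It is not the WordShapeClassifier implementation: the class name `WordShapeSketch` and the scheme itself (uppercase to `X`, lowercase to `x`, digit to `d`, everything else unchanged) are assumptions chosen for illustration.

```java
public class WordShapeSketch {

    /** Maps each character to a class symbol so that the result reflects capitalization and digits. */
    public static String shape(String word) {
        StringBuilder sb = new StringBuilder(word.length());
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (Character.isUpperCase(c)) {
                sb.append('X');
            } else if (Character.isLowerCase(c)) {
                sb.append('x');
            } else if (Character.isDigit(c)) {
                sb.append('d');
            } else {
                // Punctuation and other characters are kept as-is.
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(shape("Stanford")); // prints Xxxxxxxx
        System.out.println(shape("PTB-3"));    // prints XXX-d
    }
}
```

A shape string like this lets downstream code treat "Stanford" and "Berkeley" as the same kind of token without looking at the letters themselves.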