public class StanfordCoreNLP extends AnnotationPipeline
This class is designed to apply multiple Annotators to an Annotation.
You first build up the pipeline by adding Annotators, then pass in the
objects you wish to annotate and get back a fully annotated object.
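The build-then-apply pattern described above can be sketched in a few lines of plain Java. This is only an illustration of the idea: `MiniAnnotator`, `MiniPipeline`, and `MiniPipelineDemo` are hypothetical stand-ins invented for this sketch, not CoreNLP classes, and a plain `Map` plays the role of the Annotation object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT CoreNLP classes.
interface MiniAnnotator {
    // Reads what earlier steps stored in the document and adds its own layer.
    void annotate(Map<String, Object> document);
}

class MiniPipeline {
    private final List<MiniAnnotator> annotators = new ArrayList<>();

    // Step 1: build up the pipeline by adding annotators in order.
    MiniPipeline add(MiniAnnotator annotator) {
        annotators.add(annotator);
        return this;
    }

    // Step 2: pass in raw text and get back a document with every layer added.
    Map<String, Object> process(String text) {
        Map<String, Object> document = new LinkedHashMap<>();
        document.put("text", text);
        for (MiniAnnotator annotator : annotators) {
            annotator.annotate(document);
        }
        return document;
    }
}

class MiniPipelineDemo {
    static Map<String, Object> run(String text) {
        MiniPipeline pipeline = new MiniPipeline()
            // Toy tokenizer: split the text on whitespace.
            .add(doc -> doc.put("tokens",
                    Arrays.asList(((String) doc.get("text")).trim().split("\\s+"))))
            // Toy counter that relies on the tokenizer having run first.
            .add(doc -> doc.put("tokenCount", ((List<?>) doc.get("tokens")).size()));
        return pipeline.process(text);
    }

    public static void main(String[] args) {
        System.out.println(run("Stanford is in California"));
    }
}
```

The second toy annotator reads the output of the first, which is why the order in which annotators are added matters in a pipeline of this shape.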
From the command line you can, for example, tokenize text with a command like:
java edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit -file document.txt
The main entry point for the API is StanfordCoreNLP.process().
Implementation note: There are other annotation pipelines, but they don't extend this one. Look for classes that implement Annotator and which have "Pipeline" in their name.
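As a rough programmatic counterpart to the command line above, the sketch below builds a `Properties` object and shows, in comments, how it might be handed to the `StanfordCoreNLP(Properties)` constructor and `process(String)` method listed on this page. It assumes that the `annotators` property key mirrors the `-annotators` flag; `PipelineConfig` and `tokenizeAndSplit` are hypothetical names used only here. The CoreNLP calls are left as comments so the file compiles without the CoreNLP jar and models.

```java
import java.util.Properties;

// Hypothetical helper class for illustration; not part of CoreNLP.
class PipelineConfig {
    // Builds the Properties counterpart of "-annotators tokenize,ssplit"
    // (assumes the property key matches the command-line flag name).
    static Properties tokenizeAndSplit() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit");
        return props;
    }

    public static void main(String[] args) {
        Properties props = tokenizeAndSplit();
        System.out.println(props.getProperty("annotators"));

        // With the CoreNLP jar and its models on the classpath, the members
        // documented on this page would be used roughly like this:
        //
        //   StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        //   Annotation document = pipeline.process("Some text to annotate.");
        //   pipeline.prettyPrint(document, System.out);
    }
}
```

Keeping the configuration in a `Properties` object means the same settings can come from code, from a properties file, or from command-line flags.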
Nested classes/interfaces inherited from interface Annotator: Annotator.Requirement

| Modifier and Type | Field |
|---|---|
| static String | CUSTOM_ANNOTATOR_PREFIX |
| static String | DEFAULT_NEWLINE_IS_SENTENCE_BREAK |
| static String | DEFAULT_OUTPUT_FORMAT |
| static String | NEWLINE_IS_SENTENCE_BREAK_PROPERTY |
| static String | NEWLINE_SPLITTER_PROPERTY |

Fields inherited from interface Annotator: BINARIZED_TREES_REQUIREMENT, CLEAN_XML_REQUIREMENT, DETERMINISTIC_COREF_REQUIREMENT, GENDER_REQUIREMENT, GUTIME_REQUIREMENT, HEIDELTIME_REQUIREMENT, LEMMA_REQUIREMENT, NER_REQUIREMENT, NUMBER_REQUIREMENT, PARSE_AND_TAG, PARSE_REQUIREMENT, PARSE_TAG_BINARIZED_TREES, POS_REQUIREMENT, QUANTIFIABLE_ENTITY_NORMALIZATION_REQUIREMENT, RELATION_EXTRACTOR_REQUIREMENT, SSPLIT_REQUIREMENT, STANFORD_CLEAN_XML, STANFORD_DETERMINISTIC_COREF, STANFORD_GENDER, STANFORD_LEMMA, STANFORD_NER, STANFORD_PARSE, STANFORD_POS, STANFORD_REGEXNER, STANFORD_RELATION, STANFORD_SENTIMENT, STANFORD_SSPLIT, STANFORD_TOKENIZE, STANFORD_TRUECASE, STEM_REQUIREMENT, SUTIME_REQUIREMENT, TIME_WORDS_REQUIREMENT, TOKENIZE_AND_SSPLIT, TOKENIZE_REQUIREMENT, TOKENIZE_SSPLIT_NER, TOKENIZE_SSPLIT_PARSE, TOKENIZE_SSPLIT_PARSE_NER, TOKENIZE_SSPLIT_POS, TOKENIZE_SSPLIT_POS_LEMMA, TRUECASE_REQUIREMENT

| Constructor | Description |
|---|---|
| StanfordCoreNLP() | Constructs a pipeline using as properties the properties file found in the classpath. |
| StanfordCoreNLP(Properties props) | Construct a basic pipeline. |
| StanfordCoreNLP(Properties props, boolean enforceRequirements) | |
| StanfordCoreNLP(String propsFileNamePrefix) | Constructs a pipeline with the properties read from this file, which must be found in the classpath. |
| StanfordCoreNLP(String propsFileNamePrefix, boolean enforceRequirements) | |

| Modifier and Type | Method | Description |
|---|---|---|
| void | annotate(Annotation annotation) | Run the pipeline on an input annotation. |
| static void | clearAnnotatorPool() | Call this if you are no longer using StanfordCoreNLP and want to release the memory associated with the annotators. |
| double | getBeamPrintingOption() | |
| TreePrint | getConstituentTreePrinter() | |
| TreePrint | getDependencyTreePrinter() | |
| String | getEncoding() | |
| static Annotator | getExistingAnnotator(String name) | |
| Properties | getProperties() | Fetches the Properties object used to construct this Annotator. |
| static boolean | isXMLOutputPresent() | |
| static void | main(String[] args) | This can be used just for testing or for command-line text processing. |
| void | prettyPrint(Annotation annotation, OutputStream os) | Displays the output of all annotators in a format easily readable by people. |
| void | prettyPrint(Annotation annotation, PrintWriter os) | Displays the output of all annotators in a format easily readable by people. |
| Annotation | process(String text) | Runs the entire pipeline on the content of the given text passed in. |
| void | processFiles(Collection<File> files) | |
| void | processFiles(Collection<File> files, int numThreads) | |
| void | processFiles(String base, Collection<File> files, int numThreads) | |
| String | timingInformation() | Return a String that gives detailed human-readable information about how much time was spent by each annotator and by the entire annotation pipeline. |
| static boolean | usesBinaryTrees(Properties props) | Determines whether the parser annotator should default to producing binary trees. |
| void | xmlPrint(Annotation annotation, OutputStream os) | Displays the output of all annotators in XML format. |
| void | xmlPrint(Annotation annotation, Writer w) | Wrapper around xmlPrint(Annotation, OutputStream). |

Methods inherited from class AnnotationPipeline: addAnnotator, annotate, annotate, annotate, annotate, getTotalTime, requirementsSatisfied, requires

public static final String CUSTOM_ANNOTATOR_PREFIX
public static final String NEWLINE_SPLITTER_PROPERTY
public static final String NEWLINE_IS_SENTENCE_BREAK_PROPERTY
public static final String DEFAULT_NEWLINE_IS_SENTENCE_BREAK
public static final String DEFAULT_OUTPUT_FORMAT
public StanfordCoreNLP()
public StanfordCoreNLP(Properties props)
public StanfordCoreNLP(Properties props, boolean enforceRequirements)
public StanfordCoreNLP(String propsFileNamePrefix)
Parameters:
propsFileNamePrefix -
public StanfordCoreNLP(String propsFileNamePrefix, boolean enforceRequirements)
public Properties getProperties()
public TreePrint getConstituentTreePrinter()
public TreePrint getDependencyTreePrinter()
public double getBeamPrintingOption()
public String getEncoding()
public static boolean isXMLOutputPresent()
public static void clearAnnotatorPool()
public void annotate(Annotation annotation)
Specified by: annotate in interface Annotator
Overrides: annotate in class AnnotationPipeline
Parameters:
annotation - The input annotation, usually a raw document

public static boolean usesBinaryTrees(Properties props)

public Annotation process(String text)
Parameters:
text - The text to process

public void prettyPrint(Annotation annotation, OutputStream os)
Parameters:
annotation - Contains the output of all annotators
os - The output stream

public void prettyPrint(Annotation annotation, PrintWriter os)
Parameters:
annotation - Contains the output of all annotators
os - The output stream

public void xmlPrint(Annotation annotation, Writer w) throws IOException
Parameters:
annotation -
w - The Writer to send the output to
Throws:
IOException

public void xmlPrint(Annotation annotation, OutputStream os) throws IOException
Parameters:
annotation - Contains the output of all annotators
os - The output stream
Throws:
IOException

public String timingInformation()
... println().
Overrides: timingInformation in class AnnotationPipeline

public void processFiles(String base, Collection<File> files, int numThreads) throws IOException
Throws:
IOException

public void processFiles(Collection<File> files, int numThreads) throws IOException
Throws:
IOException

public void processFiles(Collection<File> files) throws IOException
Throws:
IOException

public static void main(String[] args) throws IOException, ClassNotFoundException
Example usage:
java -mx6g edu.stanford.nlp.pipeline.StanfordCoreNLP properties
Parameters:
args - List of required properties
Throws:
IOException - If IO problem
ClassNotFoundException - If class loading problem