edu.stanford.nlp.ling.tokensregex
Class CoreMapExpressionExtractor<T extends MatchedExpression>

java.lang.Object
  extended by edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor<T>

public class CoreMapExpressionExtractor<T extends MatchedExpression>
extends java.lang.Object

Represents a list of assignment and extraction rules over sequence patterns. See SequenceMatchRules for syntax of rules.

Assignment rules are used to assign a value to a variable for later use in extraction rules or for expansion in patterns.

Extraction rules are used to extract text/tokens matching regular expressions. Extraction rules are grouped into stages, with each stage consisting of the following steps:

  1. Matching of rules over text and tokens. These rules are applied directly on the text and tokens fields of the CoreMap.
  2. Matching of composite rules. Matched expressions are merged, and composite rules are applied recursively until no more changes to the matched expressions are detected.
  3. Filtering of invalid expressions. In the final phase, invalid expressions are filtered out.
The different stages are numbered and are applied in numeric order.
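
The three steps of a stage can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the CoreNLP implementation: the class StagedExtractionSketch, its Stage type, the toy rules (digitTokens, mergeFirstPair) and the limit parameter are hypothetical names introduced for illustration, and expressions are modelled as plain strings instead of MatchedExpression objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical, simplified model of staged extraction (not the CoreNLP code).
final class StagedExtractionSketch {

    // One stage bundles a basic rule, a composite rule, and a filter (all stand-ins).
    static final class Stage {
        final Function<List<String>, List<String>> basicRule;  // tokens -> matched expressions
        final UnaryOperator<List<String>> compositeRule;        // expressions -> merged expressions
        final Predicate<String> isInvalid;                      // true if the expression is dropped

        Stage(Function<List<String>, List<String>> basicRule,
              UnaryOperator<List<String>> compositeRule,
              Predicate<String> isInvalid) {
            this.basicRule = basicRule;
            this.compositeRule = compositeRule;
            this.isInvalid = isInvalid;
        }
    }

    static List<String> extract(Map<Integer, Stage> stages, List<String> tokens, int limit) {
        // A TreeMap iterates its keys in ascending order, so stages run in numeric order.
        Map<Integer, Stage> ordered = new TreeMap<>(stages);
        List<String> expressions = new ArrayList<>();
        for (Stage stage : ordered.values()) {
            // Step 1: basic rules are matched directly against the tokens.
            expressions.addAll(stage.basicRule.apply(tokens));
            // Step 2: composite rules are re-applied until nothing changes
            // (or until the assumed iteration limit is reached).
            for (int i = 0; i < limit; i++) {
                List<String> next = stage.compositeRule.apply(expressions);
                if (next.equals(expressions)) {
                    break;
                }
                expressions = new ArrayList<>(next);
            }
            // Step 3: invalid expressions are filtered out.
            expressions.removeIf(stage.isInvalid);
        }
        return expressions;
    }

    // Toy basic rule: every all-digit token is a matched expression.
    static List<String> digitTokens(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (token.matches("\\d+")) {
                out.add(token);
            }
        }
        return out;
    }

    // Toy composite rule: merge the first two expressions into one.
    static List<String> mergeFirstPair(List<String> expressions) {
        if (expressions.size() < 2) {
            return expressions;
        }
        List<String> merged = new ArrayList<>();
        merged.add(expressions.get(0) + " " + expressions.get(1));
        merged.addAll(expressions.subList(2, expressions.size()));
        return merged;
    }

    // Runs a single stage (number 1) whose filter drops expressions starting with "0".
    static List<String> demo(List<String> tokens) {
        Map<Integer, Stage> stages = new TreeMap<>();
        stages.put(1, new Stage(StagedExtractionSketch::digitTokens,
                                StagedExtractionSketch::mergeFirstPair,
                                expr -> expr.startsWith("0")));
        return extract(stages, tokens, 10);
    }

    public static void main(String[] args) {
        System.out.println(demo(Arrays.asList("call", "555", "0100", "now")));
    }
}
```

In this model the composite loop stops as soon as one pass leaves the list unchanged, which mirrors the "until no more changes" condition described above. How the real extractor carries results from one stage to the next is not specified on this page, so the accumulation across stages shown here is an assumption.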

Author:
Angel Chang
See Also:
SequenceMatchRules

Nested Class Summary
static class CoreMapExpressionExtractor.Stage<T>
          Describes one stage of extraction
 
Constructor Summary
CoreMapExpressionExtractor()
          Creates an empty instance with no rules
CoreMapExpressionExtractor(Env env)
          Creates a default instance with the specified environment.
CoreMapExpressionExtractor(Env env, java.util.List<SequenceMatchRules.Rule> rules)
          Creates an instance with the specified environment and list of rules
 
Method Summary
 void appendRules(java.util.List<SequenceMatchRules.Rule> rules)
          Adds the specified rules to this extractor
 Pair<java.util.List<? extends CoreMap>,java.util.List<T>> applyCompositeRule(SequenceMatchRules.ExtractRule<java.util.List<? extends CoreMap>,T> compositeExtractRule, java.util.List<? extends CoreMap> merged, java.util.List<T> matchedExpressions, int limit)
           
static CoreMapExpressionExtractor createExtractorFromFile(Env env, java.lang.String filename)
          Creates an extractor using the specified environment, and reading the rules from the given filename
static CoreMapExpressionExtractor createExtractorFromFiles(Env env, java.util.List<java.lang.String> filenames)
          Creates an extractor using the specified environment, and reading the rules from the given filenames
static CoreMapExpressionExtractor createExtractorFromFiles(Env env, java.lang.String... filenames)
          Creates an extractor using the specified environment, and reading the rules from the given filenames
static CoreMapExpressionExtractor createExtractorFromString(Env env, java.lang.String str)
          Creates an extractor using the specified environment, and reading the rules from the given string
 java.util.List<CoreMap> extractCoreMaps(CoreMap annotation)
          Returns the list of CoreMaps that match the specified rules
 java.util.List<CoreMap> extractCoreMapsMergedWithTokens(CoreMap annotation)
          Returns list of merged tokens and original tokens
 java.util.List<CoreMap> extractCoreMapsToList(java.util.List<CoreMap> res, CoreMap annotation)
           
 java.util.List<T> extractExpressions(CoreMap annotation)
           
 java.util.List<CoreMap> flatten(java.util.List<CoreMap> cms)
           
 java.util.List<CoreMap> flatten(java.util.List<CoreMap> cms, java.lang.Class key)
           
 Env getEnv()
           
 Value getValue(java.lang.String varname)
           
 void setExtractRules(SequenceMatchRules.ExtractRule<CoreMap,T> basicExtractRule, SequenceMatchRules.ExtractRule<java.util.List<? extends CoreMap>,T> compositeExtractRule, Filter<T> filterRule)
           
 void setLogger(java.util.logging.Logger logger)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CoreMapExpressionExtractor

public CoreMapExpressionExtractor()
Creates an empty instance with no rules


CoreMapExpressionExtractor

public CoreMapExpressionExtractor(Env env)
Creates a default instance with the specified environment, using the default tokens annotation key specified in the environment.

Parameters:
env - Environment to use for binding variables and applying rules

CoreMapExpressionExtractor

public CoreMapExpressionExtractor(Env env,
                                  java.util.List<SequenceMatchRules.Rule> rules)
Creates an instance with the specified environment and list of rules

Parameters:
env - Environment to use for binding variables and applying rules
rules - List of rules for this extractor
Method Detail

appendRules

public void appendRules(java.util.List<SequenceMatchRules.Rule> rules)
Adds the specified rules to this extractor

Parameters:
rules - List of rules to add to this extractor

getEnv

public Env getEnv()

setLogger

public void setLogger(java.util.logging.Logger logger)

setExtractRules

public void setExtractRules(SequenceMatchRules.ExtractRule<CoreMap,T> basicExtractRule,
                            SequenceMatchRules.ExtractRule<java.util.List<? extends CoreMap>,T> compositeExtractRule,
                            Filter<T> filterRule)

createExtractorFromFiles

public static CoreMapExpressionExtractor createExtractorFromFiles(Env env,
                                                                  java.lang.String... filenames)
                                                           throws java.lang.RuntimeException
Creates an extractor using the specified environment, and reading the rules from the given filenames

Parameters:
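env - Environment to use for binding variables and applying rules
filenames - Names of the files to read the rules from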
Throws:
java.lang.RuntimeException

createExtractorFromFiles

public static CoreMapExpressionExtractor createExtractorFromFiles(Env env,
                                                                  java.util.List<java.lang.String> filenames)
                                                           throws java.lang.RuntimeException
Creates an extractor using the specified environment, and reading the rules from the given filenames

Parameters:
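env - Environment to use for binding variables and applying rules
filenames - List of names of the files to read the rules from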
Throws:
java.lang.RuntimeException

createExtractorFromFile

public static CoreMapExpressionExtractor createExtractorFromFile(Env env,
                                                                 java.lang.String filename)
                                                          throws java.lang.RuntimeException
Creates an extractor using the specified environment, and reading the rules from the given filename

Parameters:
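env - Environment to use for binding variables and applying rules
filename - Name of the file to read the rules from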
Throws:
java.lang.RuntimeException

createExtractorFromString

public static CoreMapExpressionExtractor createExtractorFromString(Env env,
                                                                   java.lang.String str)
                                                            throws java.io.IOException,
                                                                   ParseException
Creates an extractor using the specified environment, and reading the rules from the given string

Parameters:
env - Environment to use for binding variables and applying rules
str - String to read the rules from
Throws:
java.io.IOException
ParseException

getValue

public Value getValue(java.lang.String varname)

extractCoreMapsToList

public java.util.List<CoreMap> extractCoreMapsToList(java.util.List<CoreMap> res,
                                                     CoreMap annotation)

extractCoreMaps

public java.util.List<CoreMap> extractCoreMaps(CoreMap annotation)
Returns the list of CoreMaps that match the specified rules

Parameters:
annotation - CoreMap whose text and tokens the rules are matched against

extractCoreMapsMergedWithTokens

public java.util.List<CoreMap> extractCoreMapsMergedWithTokens(CoreMap annotation)
Returns a list of merged tokens and original tokens

Parameters:
annotation - CoreMap whose text and tokens the rules are matched against
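
One plausible reading of "merged tokens and original tokens" is a token list in which each matched span is collapsed into a single merged entry while unmatched tokens are kept as they are. The sketch below illustrates that reading only; it is an assumption, not the library's code. MergeWithTokensSketch, Span and mergeWithTokens are hypothetical names, tokens are plain strings, and spans are assumed to be sorted and non-overlapping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of merging matched spans back into a token list.
final class MergeWithTokensSketch {

    // Stand-in for a matched expression covering tokens [begin, end).
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // Assumes spans are sorted by begin offset and do not overlap.
    static List<String> mergeWithTokens(List<String> tokens, List<Span> spans) {
        List<String> merged = new ArrayList<>();
        int next = 0; // index of the first token not yet copied
        for (Span span : spans) {
            // Keep the original tokens that precede this match.
            merged.addAll(tokens.subList(next, span.begin));
            // Replace the matched tokens with one merged entry.
            merged.add("[" + String.join(" ", tokens.subList(span.begin, span.end)) + "]");
            next = span.end;
        }
        // Keep the original tokens after the last match.
        merged.addAll(tokens.subList(next, tokens.size()));
        return merged;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("meet", "me", "on", "May", "5", "at", "noon");
        List<Span> spans = Arrays.asList(new Span(3, 5), new Span(6, 7));
        // prints [meet, me, on, [May 5], at, [noon]]
        System.out.println(mergeWithTokens(tokens, spans));
    }
}
```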

flatten

public java.util.List<CoreMap> flatten(java.util.List<CoreMap> cms)

flatten

public java.util.List<CoreMap> flatten(java.util.List<CoreMap> cms,
                                       java.lang.Class key)

applyCompositeRule

public Pair<java.util.List<? extends CoreMap>,java.util.List<T>> applyCompositeRule(SequenceMatchRules.ExtractRule<java.util.List<? extends CoreMap>,T> compositeExtractRule,
                                                                                    java.util.List<? extends CoreMap> merged,
                                                                                    java.util.List<T> matchedExpressions,
                                                                                    int limit)

extractExpressions

public java.util.List<T> extractExpressions(CoreMap annotation)


Stanford NLP Group