edu.stanford.nlp.ling.tokensregex
Class SequencePattern<T>

java.lang.Object
  extended by edu.stanford.nlp.ling.tokensregex.SequencePattern<T>
Direct Known Subclasses:
TokenSequencePattern

public class SequencePattern<T>
extends java.lang.Object

Generic Sequence Pattern for regular expressions.

Similar to Java's Pattern except it is for sequences over arbitrary types T instead of just characters.

A regular expression must first be compiled into an instance of this class. The resulting pattern can then be used to create a SequenceMatcher object that can match arbitrary sequences of type T against the regular expression. All of the state involved in performing a match resides in the matcher, so many matchers can share the same pattern.
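
The same compile-once, match-many split exists in java.util.regex, which this class is modeled on. The sketch below uses only the standard library, not SequencePattern itself; the class name SharedPatternDemo and its methods are made up for illustration. It shows one compiled Pattern shared by several independent Matcher objects, each holding its own position.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SharedPatternDemo {
    // Compiled once; the Pattern itself carries no per-match state.
    static final Pattern NUMBER = Pattern.compile("[0-9]+");

    // Each call creates its own Matcher, which holds all state for that match.
    static List<String> findAll(CharSequence input) {
        List<String> found = new ArrayList<>();
        Matcher m = NUMBER.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // Two matchers over different inputs advance independently of each other.
    static String interleaved() {
        Matcher first = NUMBER.matcher("7 8");
        Matcher second = NUMBER.matcher("9");
        StringBuilder sb = new StringBuilder();
        first.find();
        sb.append(first.group());   // "7"
        second.find();
        sb.append(second.group());  // "9"
        first.find();
        sb.append(first.group());   // "8"
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(findAll("a1 b22 c333")); // [1, 22, 333]
        System.out.println(interleaved());          // 798
    }
}
```

With SequencePattern the roles are the same, except that the input is a List of T rather than a character sequence.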

See TokenSequencePattern for an example of how this class can be extended to support a specific type T.

To use:


   // env is the Env the pattern is compiled against; see compile(Env, String)
   SequencePattern<T> p = SequencePattern.compile(env, "....");
   SequenceMatcher<T> m = p.getMatcher(tokens);
   while (m.find()) ....
 

To support a new type T:

  1. For a type T to be matchable, it must have a corresponding NodePattern that indicates whether a node is matched or not (see CoreMapNodePattern for an example)
  2. To compile a string into the corresponding pattern, you will need to create a parser (see inner class Parser, TokenSequencePattern and TokenSequenceParser.jj)
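
As a rough illustration of the first requirement, the sketch below matches sequences over Integer using one per-node predicate per position. It is a toy stand-in and not the library's implementation: ToyNodePattern, ToySequencePattern, and evenThenOdd are hypothetical names, and the real SequencePattern compiles to an NFA with repetition, alternation, and groups, none of which are modeled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ToySequenceDemo {
    /** Hypothetical stand-in for a node pattern: decides whether one node matches. */
    interface ToyNodePattern<T> {
        boolean match(T node);
    }

    /** Hypothetical fixed-length sequence pattern: one node pattern per position. */
    static final class ToySequencePattern<T> {
        private final List<ToyNodePattern<T>> nodePatterns;

        ToySequencePattern(List<ToyNodePattern<T>> nodePatterns) {
            this.nodePatterns = new ArrayList<>(nodePatterns);
        }

        /** Returns the start index of every non-overlapping match, scanning left to right. */
        List<Integer> findAll(List<? extends T> tokens) {
            List<Integer> starts = new ArrayList<>();
            int i = 0;
            while (i + nodePatterns.size() <= tokens.size()) {
                if (matchesAt(tokens, i)) {
                    starts.add(i);
                    i += Math.max(1, nodePatterns.size()); // skip past the match
                } else {
                    i++;
                }
            }
            return starts;
        }

        /** Checks each node pattern against the token at the same offset from start. */
        private boolean matchesAt(List<? extends T> tokens, int start) {
            for (int k = 0; k < nodePatterns.size(); k++) {
                if (!nodePatterns.get(k).match(tokens.get(start + k))) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Builds a pattern over Integer nodes: an even number followed by an odd number. */
    static ToySequencePattern<Integer> evenThenOdd() {
        List<ToyNodePattern<Integer>> nodes = new ArrayList<>();
        nodes.add(n -> n % 2 == 0);
        nodes.add(n -> n % 2 != 0);
        return new ToySequencePattern<>(nodes);
    }

    public static void main(String[] args) {
        List<Integer> tokens = Arrays.asList(2, 3, 5, 4, 7, 8);
        System.out.println(evenThenOdd().findAll(tokens)); // [0, 3]
    }
}
```

The second requirement, a parser, would turn a pattern string into the list of node patterns that evenThenOdd() builds by hand here.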

SequencePattern supports the following standard regex features:

SequencePattern also supports the following less standard features:

  1. Environment (see Env) with respect to which the patterns are compiled
  2. Binding of variables
    Use Env to bind variables for use when compiling patterns
    Can also bind names to groups (see SequenceMatchResult for accessor methods to retrieve matched groups)
  3. Backreference matches - how back references are to be matched must be specified using SequencePattern.NodesMatchChecker
  4. Multinode matches - for matching multiple nodes using non-regex (at least not regex over nodes) patterns (requires a corresponding MultiNodePattern; see MultiCoreMapNodePattern for an example)
  5. Conjunctions - conjunctions of sequence patterns (works for some cases)
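
To see why backreferences over arbitrary nodes need an explicit checker, consider the sketch below. It is an assumption-laden toy, not the NodesMatchChecker API: ToyBackRefDemo and findRepeat are hypothetical, and a plain java.util.function.BiPredicate stands in for the checker. It looks for a node immediately followed by a "matching" node, roughly what a pattern of the form (X) \1 expresses, and shows that the result depends on how matching is defined.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

public class ToyBackRefDemo {
    /**
     * Returns the first index i such that tokens[i + 1] matches tokens[i]
     * according to the supplied checker, or -1 if there is no such index.
     */
    static <T> int findRepeat(List<T> tokens, BiPredicate<T, T> checker) {
        for (int i = 0; i + 1 < tokens.size(); i++) {
            if (checker.test(tokens.get(i), tokens.get(i + 1))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "The", "cat", "cat");
        // Exact equality: "the" and "The" differ, so the first repeat is "cat cat".
        System.out.println(findRepeat(tokens, String::equals));           // 2
        // Case-insensitive checker: "the The" now counts as a repeat.
        System.out.println(findRepeat(tokens, String::equalsIgnoreCase)); // 0
    }
}
```

For richer node types such as token annotations, plain equality is rarely the right comparison, which is presumably why the checker is left to the caller.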

Author:
Angel Chang
See Also:
SequenceMatcher

Nested Class Summary
static class SequencePattern.AndPatternExpr
           
static class SequencePattern.BackRefPatternExpr
           
static class SequencePattern.GroupPatternExpr
           
static class SequencePattern.MultiNodePatternExpr
           
static class SequencePattern.NodePatternExpr
           
protected static interface SequencePattern.NodesMatchChecker<T>
           
static class SequencePattern.OrPatternExpr
           
static interface SequencePattern.Parser<T>
           
static class SequencePattern.PatternExpr
          Represents a sequence pattern expression (before translation into an NFA)
static class SequencePattern.RepeatPatternExpr
           
static class SequencePattern.SequenceEndPatternExpr
           
static class SequencePattern.SequencePatternExpr
           
static class SequencePattern.SequenceStartPatternExpr
           
static class SequencePattern.SpecialNodePatternExpr
           
static class SequencePattern.ValuePatternExpr
           
 
Field Summary
static SequencePattern.PatternExpr ANY_NODE_PATTERN_EXPR
           
protected static edu.stanford.nlp.ling.tokensregex.SequencePattern.State MATCH_STATE
          An accepting matching state
static SequencePattern.NodesMatchChecker<java.lang.Object> NODES_EQUAL_CHECKER
           
static SequencePattern.PatternExpr SEQ_BEGIN_PATTERN_EXPR
           
static SequencePattern.PatternExpr SEQ_END_PATTERN_EXPR
           
 
Constructor Summary
protected SequencePattern(SequencePattern.PatternExpr nodeSequencePattern)
           
protected SequencePattern(java.lang.String patternStr, SequencePattern.PatternExpr nodeSequencePattern)
           
protected SequencePattern(java.lang.String patternStr, SequencePattern.PatternExpr nodeSequencePattern, SequenceMatchAction<T> action)
           
 
Method Summary
static
<T> SequencePattern<T>
compile(Env env, java.lang.String string)
           
protected static
<T> SequencePattern<T>
compile(SequencePattern.PatternExpr nodeSequencePattern)
           
<OUT> OUT
findNodePattern(Function<NodePattern<T>,OUT> filter)
           
 SequenceMatchAction<T> getAction()
           
 SequenceMatcher<T> getMatcher(java.util.List<? extends T> tokens)
           
protected  SequencePattern.PatternExpr getPatternExpr()
           
 double getPriority()
           
 java.lang.String pattern()
           
 void setAction(SequenceMatchAction<T> action)
           
 void setPriority(double priority)
           
 java.lang.String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

NODES_EQUAL_CHECKER

public static final SequencePattern.NodesMatchChecker<java.lang.Object> NODES_EQUAL_CHECKER

ANY_NODE_PATTERN_EXPR

public static final SequencePattern.PatternExpr ANY_NODE_PATTERN_EXPR

SEQ_BEGIN_PATTERN_EXPR

public static final SequencePattern.PatternExpr SEQ_BEGIN_PATTERN_EXPR

SEQ_END_PATTERN_EXPR

public static final SequencePattern.PatternExpr SEQ_END_PATTERN_EXPR

MATCH_STATE

protected static final edu.stanford.nlp.ling.tokensregex.SequencePattern.State MATCH_STATE
An accepting matching state

Constructor Detail

SequencePattern

protected SequencePattern(SequencePattern.PatternExpr nodeSequencePattern)

SequencePattern

protected SequencePattern(java.lang.String patternStr,
                          SequencePattern.PatternExpr nodeSequencePattern)

SequencePattern

protected SequencePattern(java.lang.String patternStr,
                          SequencePattern.PatternExpr nodeSequencePattern,
                          SequenceMatchAction<T> action)

Method Detail

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

pattern

public java.lang.String pattern()

getPatternExpr

protected SequencePattern.PatternExpr getPatternExpr()

getPriority

public double getPriority()

setPriority

public void setPriority(double priority)

getAction

public SequenceMatchAction<T> getAction()

setAction

public void setAction(SequenceMatchAction<T> action)

compile

public static <T> SequencePattern<T> compile(Env env,
                                             java.lang.String string)

compile

protected static <T> SequencePattern<T> compile(SequencePattern.PatternExpr nodeSequencePattern)

getMatcher

public SequenceMatcher<T> getMatcher(java.util.List<? extends T> tokens)

findNodePattern

public <OUT> OUT findNodePattern(Function<NodePattern<T>,OUT> filter)


Stanford NLP Group