java.lang.Object
  edu.stanford.nlp.ling.tokensregex.SequencePattern<CoreMap>
    edu.stanford.nlp.ling.tokensregex.TokenSequencePattern
public class TokenSequencePattern
extends SequencePattern<CoreMap>
A token sequence pattern is a regular expression over sequences of tokens (represented as the more general CoreMap).
Sequences over tokens can be matched like strings.
Usage:
TokenSequencePattern p = TokenSequencePattern.compile("....");
TokenSequenceMatcher m = p.getMatcher(tokens);
while (m.find()) ....
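Supports the following:
Concatenation: X Y
Or: X | Y
And: X & Y
Groups:
  capturing: (X) (with numeric group id)
  capturing: (?$var X) (with group name "$var")
  noncapturing: (?:X)
Capturing groups can be retrieved by group id or group variable, either as the matched string (m.group()) or as a list of tokens (m.groupNodes()).
  To retrieve a group by id: m.group(id) or m.groupNodes(id)
  To retrieve a group by bound variable name: m.group("$var") or m.groupNodes("$var")
See SequenceMatchResult for more accessor functions to retrieve matches.
Greedy quantifiers: X+, X?, X*, X{n,m}, X{n}, X{n,}
Reluctant quantifiers: X+?, X??, X*?, X{n,m}?, X{n}?, X{n,}?
Back references: \captureid
Value binding for groupings: [pattern] => [value].
Value for matched expression can be accessed using m.groupValue()
Example: ( one => 1 | two => 2 | three => 3 | ...)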
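The group numbering, greedy versus reluctant quantifiers, and back references listed above behave much like their string counterparts in java.util.regex. The following sketch is an analogy only: it uses the JDK's Pattern and Matcher on plain strings rather than TokenSequencePattern, and the class name GroupQuantifierSketch and the helper firstGroup are hypothetical. Note that the named-group syntax differs between the two (TokensRegex uses (?$var X)).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration with java.util.regex on strings; NOT the TokensRegex API.
final class GroupQuantifierSketch {

    /** Returns group {@code id} of the first match, or null when nothing matches. */
    static String firstGroup(String regex, String text, int id) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(id) : null;
    }

    public static void main(String[] args) {
        // Group 0 is the entire match; capturing groups are numbered from 1.
        System.out.println(firstGroup("(it) (was)", "it was today", 0));
        System.out.println(firstGroup("(it) (was)", "it was today", 2));

        // Greedy X+ takes as much as it can; reluctant X+? takes as little as it can.
        System.out.println(firstGroup("a+", "aaa", 0));
        System.out.println(firstGroup("a+?", "aaa", 0));

        // A back reference \1 must repeat whatever group 1 captured.
        System.out.println(firstGroup("(\\w+) \\1", "it was was today", 0));
    }
}
```

With token sequences, the same ideas apply to whole tokens instead of characters: a quantifier repeats a token expression, and a group captures a span of tokens.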
Individual tokens are marked by "[" TOKEN_EXPR "]"
Possible TOKEN_EXPR:
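All specified token attributes match:
  For strings: { lemma:/.../; tag:"NNP" } = attributes that need to all match
    /.../ is used for regular expressions, "..." for exact string matches
  For numbers: { word>=2 }
    The relation can be ">=", "<=", ">", "<", or "=="
  Others: { word::IS_NUM }, { word::IS_NIL }, { word::NOT_EXISTS }, { word::NOT_NIL } or { word::EXISTS }
Shorthand for a word/text match: /.../ or "..."
Negation: !{...}
Conjunction or disjunction: {...} & {...} or {...} | {...}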
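To make the attribute rules concrete, here is a minimal self-contained sketch of the "all attributes must match" semantics, together with negation. It does not use TokensRegex: tokens are modelled as plain Map<String, String> values, regular expressions are assumed to match the whole attribute value, and the names TokenExprSketch, regex, exact, and allOf are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Illustration only: a token is a map from attribute name to string value.
final class TokenExprSketch {

    /** Analogue of attr:/regex/ (assumes the regex must match the whole value). */
    static Predicate<Map<String, String>> regex(String attr, String regex) {
        Pattern p = Pattern.compile(regex);
        return tok -> tok.containsKey(attr) && p.matcher(tok.get(attr)).matches();
    }

    /** Analogue of attr:"value", an exact string comparison. */
    static Predicate<Map<String, String>> exact(String attr, String value) {
        return tok -> value.equals(tok.get(attr));
    }

    /** Analogue of { a; b; ... }: every listed condition has to hold. */
    static Predicate<Map<String, String>> allOf(List<Predicate<Map<String, String>>> conds) {
        return tok -> conds.stream().allMatch(c -> c.test(tok));
    }

    public static void main(String[] args) {
        Map<String, String> stanford = Map.of("word", "Stanford", "lemma", "Stanford", "tag", "NNP");
        Map<String, String> ran = Map.of("word", "ran", "lemma", "run", "tag", "VBD");

        // Rough analogue of { lemma:/[A-Z].*/; tag:"NNP" }
        Predicate<Map<String, String>> properNoun =
                allOf(List.of(regex("lemma", "[A-Z].*"), exact("tag", "NNP")));

        System.out.println(properNoun.test(stanford));
        System.out.println(properNoun.test(ran));
        // Rough analogue of !{...}
        System.out.println(properNoun.negate().test(ran));
    }
}
```

Conjunction and disjunction of token expressions correspond to Predicate.and and Predicate.or in this model.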
Special tokens:
Any token: []
String pattern match across multiple tokens:
(?m){min,max} /pattern/
Binding of variables for use in compiling patterns:
// Create an environment to hold the bindings
Env env = TokenSequencePattern.getNewEnv();
// Bind string for later compilation using: compile(env, "/it/ /was/ $RELDAY");
env.bind("$RELDAY", "/today|yesterday|tomorrow|tonight|tonite/");
// Bind pre-compiled pattern for later compilation using: compile(env, "/it/ /was/ $RELDAY");
env.bind("$RELDAY", TokenSequencePattern.compile(env, "/today|yesterday|tomorrow|tonight|tonite/"));
// Bind node pattern so we can do patterns like: compile("... temporal::IS_TIMEX_DATE ...");
// (TimexTypeMatchNodePattern is a NodePattern that implements some custom logic)
env.bind("::IS_TIMEX_DATE", new TimexTypeMatchNodePattern(SUTime.TimexType.DATE));
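One way to picture a string binding is as a named fragment that is spliced into a pattern before the pattern is compiled. The sketch below illustrates only that idea with literal string substitution. It is not how Env is implemented, and the names BindingSketch, bind, and expand are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: a toy environment that expands bound variables in a pattern string.
final class BindingSketch {
    private final Map<String, String> bindings = new LinkedHashMap<>();

    /** Associates a variable name such as "$RELDAY" with a pattern fragment. */
    void bind(String name, String fragment) {
        bindings.put(name, fragment);
    }

    /**
     * Replaces every bound variable with its fragment. String.replace is literal,
     * so "$" needs no escaping. This naive version does not handle one variable
     * name being a prefix of another.
     */
    String expand(String pattern) {
        String out = pattern;
        for (Map.Entry<String, String> e : bindings.entrySet()) {
            out = out.replace(e.getKey(), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        BindingSketch env = new BindingSketch();
        env.bind("$RELDAY", "/today|yesterday|tomorrow|tonight|tonite/");
        System.out.println(env.expand("/it/ /was/ $RELDAY"));
    }
}
```

Binding a fragment once and reusing it keeps long alternations out of the individual patterns.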
Actions (partially implemented)
pattern ==> action
Supported action: &annotate( { ner="DATE" } )
To apply an action, call pattern.getAction().apply(match, groupid)
See Also: TokenSequenceMatcher
| Nested Class Summary |
|---|
| Field Summary | |
|---|---|
| static TokenSequencePattern | ANY_NODE_PATTERN |

| Fields inherited from class edu.stanford.nlp.ling.tokensregex.SequencePattern |
|---|
| ANY_NODE_PATTERN_EXPR, MATCH_STATE, NODES_EQUAL_CHECKER, SEQ_BEGIN_PATTERN_EXPR, SEQ_END_PATTERN_EXPR |
| Constructor Summary | |
|---|---|
| TokenSequencePattern(java.lang.String patternStr, SequencePattern.PatternExpr nodeSequencePattern) | |
| TokenSequencePattern(java.lang.String patternStr, SequencePattern.PatternExpr nodeSequencePattern, SequenceMatchAction<CoreMap> action) | |
| Method Summary | |
|---|---|
| static TokenSequencePattern | compile(Env env, java.lang.String... strings) - Compiles a sequence of regular expressions into a TokenSequencePattern using the specified environment |
| static TokenSequencePattern | compile(Env env, java.lang.String string) - Compiles a regular expression over tokens into a TokenSequencePattern using the specified environment |
| static TokenSequencePattern | compile(SequencePattern.PatternExpr nodeSequencePattern) |
| static TokenSequencePattern | compile(java.lang.String... strings) - Compiles a sequence of regular expressions into a TokenSequencePattern using the default environment |
| static TokenSequencePattern | compile(java.lang.String string) - Compiles a regular expression over tokens into a TokenSequencePattern using the default environment |
| TokenSequenceMatcher | getMatcher(java.util.List<? extends CoreMap> tokens) - Returns a TokenSequenceMatcher that can be used to match this pattern against the specified list of tokens |
| static MultiPatternMatcher<CoreMap> | getMultiPatternMatcher(java.util.Collection<TokenSequencePattern> patterns) |
| static MultiPatternMatcher<CoreMap> | getMultiPatternMatcher(TokenSequencePattern... patterns) |
| static Env | getNewEnv() |
| java.lang.String | toString() |
| Methods inherited from class edu.stanford.nlp.ling.tokensregex.SequencePattern |
|---|
| findNodePattern, getAction, getPatternExpr, getPriority, pattern, setAction, setPriority |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Field Detail |
|---|
public static final TokenSequencePattern ANY_NODE_PATTERN
| Constructor Detail |
|---|
public TokenSequencePattern(java.lang.String patternStr,
SequencePattern.PatternExpr nodeSequencePattern)
public TokenSequencePattern(java.lang.String patternStr,
SequencePattern.PatternExpr nodeSequencePattern,
SequenceMatchAction<CoreMap> action)
| Method Detail |
|---|
public static Env getNewEnv()
public static TokenSequencePattern compile(java.lang.String string)
string - Regular expression to be compiled
public static TokenSequencePattern compile(Env env,
java.lang.String string)
env - Environment to use
string - Regular expression to be compiled
public static TokenSequencePattern compile(java.lang.String... strings)
strings - List of regular expressions to be compiled
public static TokenSequencePattern compile(Env env,
java.lang.String... strings)
env - Environment to use
strings - List of regular expressions to be compiled
public static TokenSequencePattern compile(SequencePattern.PatternExpr nodeSequencePattern)
public TokenSequenceMatcher getMatcher(java.util.List<? extends CoreMap> tokens)
Overrides: getMatcher in class SequencePattern<CoreMap>
tokens - List of tokens to match against
public java.lang.String toString()
Overrides: toString in class SequencePattern<CoreMap>
public static MultiPatternMatcher<CoreMap> getMultiPatternMatcher(java.util.Collection<TokenSequencePattern> patterns)
public static MultiPatternMatcher<CoreMap> getMultiPatternMatcher(TokenSequencePattern... patterns)