edu.stanford.nlp.ling.tokensregex
Class SequenceMatcher<T>

java.lang.Object
  extended by edu.stanford.nlp.ling.tokensregex.BasicSequenceMatchResult<T>
      extended by edu.stanford.nlp.ling.tokensregex.SequenceMatcher<T>
All Implemented Interfaces:
SequenceMatchResult<T>, HasInterval<java.lang.Integer>, java.util.regex.MatchResult
Direct Known Subclasses:
CoreMapSequenceMatcher

public class SequenceMatcher<T>
extends BasicSequenceMatchResult<T>

Generic sequence matcher

Similar to Java's Matcher, except that it matches sequences over an arbitrary type T instead of characters. For a type T to be matchable, it must have a corresponding NodePattern that indicates whether a node is matched or not.

A matcher is created as follows:


   SequencePattern p = SequencePattern.compile("...");
   SequenceMatcher m = p.getMatcher(sequence);   // sequence: the List to match over
 

Functions for searching


    boolean matches()
    boolean find()
    boolean find(int start)
 
Functions for retrieving matched patterns

    int groupCount()
    List groupNodes(), List groupNodes(int g)
    String group(), String group(int g)
    int start(), int start(int g), int end(), int end(int g)
 
Functions for defining the region of the sequence to search over (default region is entire sequence)

     void region(int start, int end)
     int regionStart()
     int regionEnd()
 

NOTE: When find is used, matches are attempted starting from the specified start index of the sequence. The match with the earliest starting index is returned.
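
To make the "earliest starting index" rule concrete, here is a stdlib-only sketch that mimics it on a plain list. It is not the library's implementation: the class ToyListMatcher, the method findFrom, and the use of a fixed-length list of Predicate objects as a stand-in for NodePattern are all hypothetical, and the toy omits the quantifiers and groups that real patterns support.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical toy, NOT part of tokensregex: a fixed-length pattern in which
// each Predicate plays the role of a NodePattern for one element.
class ToyListMatcher<T> {
    private final List<Predicate<T>> pattern;
    private final List<? extends T> elements;

    ToyListMatcher(List<Predicate<T>> pattern, List<? extends T> elements) {
        this.pattern = pattern;
        this.elements = elements;
    }

    // Returns the earliest index >= from at which the whole pattern matches, or -1.
    int findFrom(int from) {
        for (int start = from; start + pattern.size() <= elements.size(); start++) {
            if (matchesAt(start)) {
                return start;
            }
        }
        return -1;
    }

    // Checks every node predicate against the element at the same offset.
    private boolean matchesAt(int start) {
        for (int i = 0; i < pattern.size(); i++) {
            if (!pattern.get(i).test(elements.get(start + i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Predicate<String> isNew = s -> s.equals("New");
        Predicate<String> capitalized = s -> Character.isUpperCase(s.charAt(0));
        List<Predicate<String>> p = Arrays.asList(isNew, capitalized);
        List<String> tokens =
                Arrays.asList("I", "love", "New", "York", "and", "New", "Jersey");
        ToyListMatcher<String> m = new ToyListMatcher<>(p, tokens);
        System.out.println(m.findFrom(0)); // 2
        System.out.println(m.findFrom(3)); // 5
    }
}
```

Candidate start positions are tried in increasing order, so the first hit is the match with the earliest starting index at or after the requested start.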

Author:
Angel Chang

Nested Class Summary
static class SequenceMatcher.BasicMatchReplacement<T>
          Replacement item is a sequence of items
static class SequenceMatcher.FindType
          Type of search to perform: FIND_NONOVERLAPPING - find nonoverlapping matches (default); FIND_ALL - find all potential matches. With FIND_ALL, greedy/reluctant quantifiers are not enforced (perhaps should add syntax where some of them are enforced...)
static class SequenceMatcher.GroupMatchReplacement<T>
          Replacement item is a matched group specified with a group id
static interface SequenceMatcher.MatchReplacement<T>
          Interface that specifies what to replace a matched pattern with
static class SequenceMatcher.NamedGroupMatchReplacement<T>
          Replacement item is a matched group specified with a group name
 
Nested classes/interfaces inherited from class edu.stanford.nlp.ling.tokensregex.BasicSequenceMatchResult
BasicSequenceMatchResult.MatchedGroup
 
Nested classes/interfaces inherited from interface edu.stanford.nlp.ling.tokensregex.SequenceMatchResult
SequenceMatchResult.GroupToIntervalFunc<MR extends java.util.regex.MatchResult>, SequenceMatchResult.MatchedGroupInfo<T>
 
Field Summary
 
Fields inherited from interface edu.stanford.nlp.ling.tokensregex.SequenceMatchResult
DEFAULT_COMPARATOR, GROUP_AFTER_MATCH, GROUP_BEFORE_MATCH, LENGTH_COMPARATOR, OFFSET_COMPARATOR, ORDER_COMPARATOR, SCORE_COMPARATOR, SCORE_LENGTH_ORDER_OFFSET_COMPARATOR, TO_INTERVAL
 
Fields inherited from interface edu.stanford.nlp.util.HasInterval
CONTAINS_FIRST_ENDPOINTS_COMPARATOR, ENDPOINTS_COMPARATOR, NESTED_FIRST_ENDPOINTS_COMPARATOR
 
Constructor Summary
protected SequenceMatcher(SequencePattern pattern, java.util.List<? extends T> elements)
           
 
Method Summary
 int end(int group)
           
 boolean find()
          Searches for the next occurrence of the pattern
 boolean find(int start)
          Resets the matcher and then searches for the pattern starting at the specified start index
protected  boolean find(int start, boolean matchStart)
           
protected  boolean findMatchStart(int start, boolean matchAllTokens)
           
protected  boolean findMatchStartBacktracking(int start, boolean matchAllTokens)
           
protected  boolean findMatchStartNoBacktracking(int start, boolean matchAllTokens)
           
 T get(int i)
          Returns the ith element
 SequenceMatcher.FindType getFindType()
           
 SequenceMatchResult.MatchedGroupInfo<T> groupInfo(int group)
           
 java.lang.Object groupMatchResult(int group, int index)
          Returns an Object representing the result for the match for a particular node in a group.
 java.util.List<java.lang.Object> groupMatchResults(int group)
          Returns a list of Objects representing the match results for the nodes in the group.
 java.util.List<? extends T> groupNodes(int group)
          Returns the matched group as a list.
 java.lang.Object groupValue(int group)
           
 boolean isMatchWithResult()
           
 boolean matches()
          Checks if the pattern matches the entire sequence
 java.lang.Object nodeMatchResult(int index)
          Returns an Object representing the result for the match for a particular node.
 void region(int start, int end)
          Sets the region to search in
 int regionEnd()
           
 int regionStart()
           
 java.util.List<T> replaceAll(java.util.List<T> replacement)
          Replaces all occurrences of the pattern with the specified list.
 java.util.List<T> replaceAllExtended(java.util.List<SequenceMatcher.MatchReplacement<T>> replacement)
          Replaces all occurrences of the pattern with the specified list of replacement items (can include matched groups).
 java.util.List<T> replaceFirst(java.util.List<T> replacement)
          Replaces the first occurrence of the pattern with the specified list.
 java.util.List<T> replaceFirstExtended(java.util.List<SequenceMatcher.MatchReplacement<T>> replacement)
          Replaces the first occurrence of the pattern with the specified list of replacement items (can include matched groups).
 void reset()
          Clears the matcher: clears matched groups and resets the region to the entire sequence
 void setFindType(SequenceMatcher.FindType findType)
           
 void setMatchWithResult(boolean matchWithResult)
           
 int start(int group)
           
 BasicSequenceMatchResult<T> toBasicSequenceMatchResult()
           
 
Methods inherited from class edu.stanford.nlp.ling.tokensregex.BasicSequenceMatchResult
copy, elements, end, end, getInterval, getOrder, group, group, group, groupCount, groupInfo, groupInfo, groupMatchResult, groupMatchResults, groupMatchResults, groupNodes, groupNodes, groupValue, groupValue, score, setOrder, start, start, toBasicSequenceMatchResult
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SequenceMatcher

protected SequenceMatcher(SequencePattern pattern,
                          java.util.List<? extends T> elements)
Method Detail

replaceAllExtended

public java.util.List<T> replaceAllExtended(java.util.List<SequenceMatcher.MatchReplacement<T>> replacement)
Replaces all occurrences of the pattern with the specified list of replacement items (can include matched groups).

Parameters:
replacement - What to replace the matched sequence with
Returns:
New list with all occurrences of the pattern replaced
See Also:
replaceFirst(java.util.List), replaceFirstExtended(java.util.List), replaceAll(java.util.List)

replaceFirstExtended

public java.util.List<T> replaceFirstExtended(java.util.List<SequenceMatcher.MatchReplacement<T>> replacement)
Replaces the first occurrence of the pattern with the specified list of replacement items (can include matched groups).

Parameters:
replacement - What to replace the matched sequence with
Returns:
New list with the first occurrence of the pattern replaced
See Also:
replaceFirst(java.util.List), replaceAll(java.util.List), replaceAllExtended(java.util.List)

replaceAll

public java.util.List<T> replaceAll(java.util.List<T> replacement)
Replaces all occurrences of the pattern with the specified list. Use replaceAllExtended(java.util.List) to replace with matched groups.

Parameters:
replacement - What to replace the matched sequence with
Returns:
New list with all occurrences of the pattern replaced
See Also:
replaceAllExtended(java.util.List), replaceFirst(java.util.List), replaceFirstExtended(java.util.List)

replaceFirst

public java.util.List<T> replaceFirst(java.util.List<T> replacement)
Replaces the first occurrence of the pattern with the specified list. Use replaceFirstExtended(java.util.List) to replace with matched groups.

Parameters:
replacement - What to replace the matched sequence with
Returns:
New list with the first occurrence of the pattern replaced
See Also:
replaceAll(java.util.List), replaceAllExtended(java.util.List), replaceFirstExtended(java.util.List)
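
The difference between replacing all occurrences and replacing only the first one can be sketched without the library. The example below is an illustration under stated assumptions, not the tokensregex implementation: ToyListReplace and its replace method are hypothetical, and the "pattern" is simply a literal sub-list. As in the methods documented above, the result is a new list and the input is left unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy, NOT the tokensregex implementation: the "pattern" is a literal sub-list.
class ToyListReplace {
    static <T> List<T> replace(List<T> elements, List<T> target,
                               List<T> replacement, boolean all) {
        List<T> out = new ArrayList<>();
        boolean stop = false;
        int i = 0;
        while (i < elements.size()) {
            int end = i + target.size();
            boolean hit = !stop && !target.isEmpty() && end <= elements.size()
                    && elements.subList(i, end).equals(target);
            if (hit) {
                out.addAll(replacement); // emit the replacement instead of the match
                i = end;                 // continue after the match (nonoverlapping)
                stop = !all;             // "first" mode replaces a single occurrence
            } else {
                out.add(elements.get(i));
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("New", "York", "to", "New", "York");
        List<String> target = Arrays.asList("New", "York");
        List<String> nyc = Arrays.asList("NYC");
        System.out.println(replace(tokens, target, nyc, true));  // [NYC, to, NYC]
        System.out.println(replace(tokens, target, nyc, false)); // [NYC, to, New, York]
        System.out.println(tokens); // [New, York, to, New, York]
    }
}
```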

getFindType

public SequenceMatcher.FindType getFindType()

setFindType

public void setFindType(SequenceMatcher.FindType findType)

isMatchWithResult

public boolean isMatchWithResult()

setMatchWithResult

public void setMatchWithResult(boolean matchWithResult)

find

public boolean find(int start)
Resets the matcher and then searches for the pattern starting at the specified start index

Parameters:
start - Index at which to start the search
Returns:
true if a match is found (false otherwise)
Throws:
java.lang.IndexOutOfBoundsException - if start is < 0 or larger than the size of the sequence
See Also:
find()

find

protected boolean find(int start,
                       boolean matchStart)

find

public boolean find()
Searches for the next occurrence of the pattern

Returns:
true if a match is found (false otherwise)
See Also:
find(int)
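
Since this class is described as similar to Java's Matcher, the usual find loop can be illustrated with java.util.regex.Matcher from the standard library. This is an analogy only: it assumes that SequenceMatcher is driven the same way, with start() reporting element indices rather than character offsets. The class name FindLoopAnalogy is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindLoopAnalogy {
    // Collects the start offsets of successive find() calls.
    static List<Integer> allStarts(Matcher m) {
        List<Integer> starts = new ArrayList<>();
        while (m.find()) {          // each call resumes after the previous match
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("ab").matcher("ab ab ab");
        System.out.println(allStarts(m)); // [0, 3, 6]
        // find(int) resets the matcher first, then searches from the given index.
        System.out.println(m.find(1) + " at " + m.start()); // true at 3
    }
}
```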

findMatchStart

protected boolean findMatchStart(int start,
                                 boolean matchAllTokens)

findMatchStartNoBacktracking

protected boolean findMatchStartNoBacktracking(int start,
                                               boolean matchAllTokens)

findMatchStartBacktracking

protected boolean findMatchStartBacktracking(int start,
                                             boolean matchAllTokens)

matches

public boolean matches()
Checks if the pattern matches the entire sequence

Returns:
true if the entire sequence is matched (false otherwise)
See Also:
find()
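
The distinction between matching the entire sequence and finding an occurrence somewhere in it mirrors java.util.regex.Matcher. The stdlib sketch below shows that distinction on characters; treating it as a model for SequenceMatcher is an assumption based on the class description, and MatchesVsFindAnalogy is a hypothetical name.

```java
import java.util.regex.Pattern;

class MatchesVsFindAnalogy {
    // True only if the pattern covers the whole input.
    static boolean whole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if the pattern occurs anywhere in the input.
    static boolean anywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(whole("ab", "ab ab"));       // false
        System.out.println(anywhere("ab", "ab ab"));    // true
        System.out.println(whole("ab( ab)*", "ab ab")); // true
    }
}
```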

region

public void region(int start,
                   int end)
Sets the region to search in

Parameters:
start - start index
end - end index (exclusive)
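
The effect of an exclusive end index can be seen with the analogous region method of java.util.regex.Matcher. This is a character-level analogy under the assumption that SequenceMatcher regions restrict the search in the same way; note that the method documented here returns void, whereas the stdlib version returns the matcher. RegionAnalogy is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegionAnalogy {
    // Returns the start of the first match inside [start, end), or -1 if none.
    static int firstStartInRegion(String regex, String input, int start, int end) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.region(start, end);   // end index is exclusive
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        String s = "ab ab ab";
        System.out.println(firstStartInRegion("ab", s, 1, 8)); // 3
        System.out.println(firstStartInRegion("ab", s, 1, 4)); // -1
    }
}
```

In the second call the occurrence at offset 3 would need offset 4 as well, which lies outside the region, so nothing is found.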

regionEnd

public int regionEnd()

regionStart

public int regionStart()

toBasicSequenceMatchResult

public BasicSequenceMatchResult<T> toBasicSequenceMatchResult()
Specified by:
toBasicSequenceMatchResult in interface SequenceMatchResult<T>
Overrides:
toBasicSequenceMatchResult in class BasicSequenceMatchResult<T>

start

public int start(int group)
Specified by:
start in interface java.util.regex.MatchResult
Overrides:
start in class BasicSequenceMatchResult<T>

end

public int end(int group)
Specified by:
end in interface java.util.regex.MatchResult
Overrides:
end in class BasicSequenceMatchResult<T>

groupNodes

public java.util.List<? extends T> groupNodes(int group)
Description copied from interface: SequenceMatchResult
Returns the matched group as a list.

Specified by:
groupNodes in interface SequenceMatchResult<T>
Overrides:
groupNodes in class BasicSequenceMatchResult<T>
Parameters:
group - The index of a capturing group in this matcher's pattern
Returns:
the matched group as a list
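
One way to picture the relationship between a group's start and end indices and the nodes it covers is as a half-open span over the element list. The sketch below is stdlib-only and rests on an assumption, not on the source: that the nodes of a group correspond to the elements in [start(group), end(group)). The helper span and the class GroupSpanAnalogy are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupSpanAnalogy {
    // Hypothetical helper: the elements covered by the half-open span [start, end).
    static <T> List<T> span(List<T> elements, int start, int end) {
        return elements.subList(start, end);
    }

    public static void main(String[] args) {
        // Character-level analogue: group(g) is the input between start(g) and end(g).
        String input = "flight to (SFO)";
        Matcher m = Pattern.compile("\\((\\w+)\\)").matcher(input);
        if (m.find()) {
            String viaOffsets = input.substring(m.start(1), m.end(1));
            System.out.println(m.group(1).equals(viaOffsets)); // true
        }
        // Token-level analogue over a list, with a span chosen by hand.
        List<String> tokens = Arrays.asList("flight", "to", "(", "SFO", ")");
        System.out.println(span(tokens, 3, 4)); // [SFO]
    }
}
```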

groupValue

public java.lang.Object groupValue(int group)
Specified by:
groupValue in interface SequenceMatchResult<T>
Overrides:
groupValue in class BasicSequenceMatchResult<T>

groupInfo

public SequenceMatchResult.MatchedGroupInfo<T> groupInfo(int group)
Specified by:
groupInfo in interface SequenceMatchResult<T>
Overrides:
groupInfo in class BasicSequenceMatchResult<T>

groupMatchResults

public java.util.List<java.lang.Object> groupMatchResults(int group)
Description copied from interface: SequenceMatchResult
Returns a list of Objects representing the match results for the nodes in the group.

Specified by:
groupMatchResults in interface SequenceMatchResult<T>
Overrides:
groupMatchResults in class BasicSequenceMatchResult<T>
Parameters:
group - The index of a capturing group in this matcher's pattern
Returns:
the list of match results associated with the nodes for the captured group.

groupMatchResult

public java.lang.Object groupMatchResult(int group,
                                         int index)
Description copied from interface: SequenceMatchResult
Returns an Object representing the result for the match for a particular node in a group. The actual Object returned depends on the type T of the nodes. For instance, for a CoreMap, the match result is returned as a Map, while for String, the match result is typically a MatchResult.

Specified by:
groupMatchResult in interface SequenceMatchResult<T>
Overrides:
groupMatchResult in class BasicSequenceMatchResult<T>
Parameters:
group - The index of a capturing group in this matcher's pattern
index - The index of the element in the captured subsequence.
Returns:
the match result associated with the node at the given index for the captured group.

nodeMatchResult

public java.lang.Object nodeMatchResult(int index)
Description copied from interface: SequenceMatchResult
Returns an Object representing the result for the match for a particular node. The actual Object returned depends on the type T of the nodes. For instance, for a CoreMap, the match result is returned as a Map, while for String, the match result is typically a MatchResult.

Specified by:
nodeMatchResult in interface SequenceMatchResult<T>
Overrides:
nodeMatchResult in class BasicSequenceMatchResult<T>
Parameters:
index - The index of the element in the original sequence.
Returns:
the match result associated with the node at the given index.

reset

public void reset()
Clears the matcher: clears matched groups and resets the region to the entire sequence
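
java.util.regex.Matcher.reset() behaves the same way with respect to the region, which makes for a small stdlib illustration. Whether SequenceMatcher matches it in every detail is an assumption; ResetAnalogy and regionAfterReset are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResetAnalogy {
    // Narrows the region, searches once, resets, and reports the region bounds.
    static int[] regionAfterReset(String input, int start, int end) {
        Matcher m = Pattern.compile("a").matcher(input);
        m.region(start, end);
        m.find();
        m.reset();  // discards match state and restores the default region
        return new int[] { m.regionStart(), m.regionEnd() };
    }

    public static void main(String[] args) {
        int[] r = regionAfterReset("banana", 2, 4);
        System.out.println(r[0] + ".." + r[1]); // 0..6
    }
}
```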


get

public T get(int i)
Returns the ith element

Parameters:
i - index
Returns:
ith element


Stanford NLP Group