java.lang.Object
  edu.stanford.nlp.ie.QuantifiableEntityNormalizer
public class QuantifiableEntityNormalizer
Various methods for normalizing Money, Date, Percent, Time, Number, and
Ordinal amounts.
These matchers are generous: they try to quantify something that has
already been labelled by an NER system, so don't use them to make
classification decisions. This class has a twin in the pipeline world:
QuantifiableEntityNormalizingAnnotator.
Please keep the substantive content here, however, to lessen code
duplication.
Implementation note: The extensive test code for this class is now in a separate JUnit Test class. This class depends on the background symbol for NER being the default background symbol. This should be fixed at some point.
| Field Summary | |
|---|---|
| static java.lang.String | BACKGROUND_SYMBOL |
| static java.util.regex.Pattern | numberPattern |
| static ClassicCounter<java.lang.String> | ordinalsToValues |
| static ClassicCounter<java.lang.String> | wordsToValues |
| Method Summary | | |
|---|---|---|
| static <E extends CoreMap> void | addNormalizedQuantitiesToEntities(java.util.List<E> l) | Identifies contiguous MONEY, TIME, DATE, or PERCENT entities and tags each of their constituents with a "normalizedQuantity" label which contains the appropriate normalized string corresponding to the full quantity. |
| static <E extends CoreMap> void | addNormalizedQuantitiesToEntities(java.util.List<E> list, boolean concatenate) | Identifies contiguous MONEY, TIME, DATE, or PERCENT entities and tags each of their constituents with a "normalizedQuantity" label which contains the appropriate normalized string corresponding to the full quantity. |
| static <E extends CoreLabel> java.util.List<E> | applySpecializedNER(java.util.List<E> l) | Runs a deterministic named entity classifier which is good at recognizing numbers and money and date expressions not recognized by our statistical NER. |
| static java.util.List<CoreLabel> | collapseNERLabels(java.util.List<CoreLabel> l) | Currently this populates a List<CoreLabel> with words from the passed List, but NER entities are collapsed and CoreLabel constituents of entities have NER information in their "quantity" fields. |
| static java.util.List<java.util.List<CoreLabel>> | normalizeClassifierOutput(java.util.List<java.util.List<CoreLabel>> l) | Takes the output of an AbstractSequenceClassifier and marks up each document by normalizing quantities. |
| static java.lang.String | normalizedNumberString(java.lang.String s, java.lang.String nextWord, java.lang.Number numberFromSUTime) | |
| static java.lang.String | normalizedNumberStringQuiet(java.lang.String s, double multiplier, java.lang.String nextWord, java.lang.Number numberFromSUTime) | |
| static java.lang.String | normalizedOrdinalString(java.lang.String s, java.lang.Number numberFromSUTime) | |
| static java.lang.String | normalizedOrdinalStringQuiet(java.lang.String s, java.lang.Number numberFromSUTime) | |
| static java.lang.String | normalizedPercentString(java.lang.String s, java.lang.Number numberFromSUTime) | |
| static java.lang.String | normalizedTimeString(java.lang.String s, java.lang.String ampm, Timex timexFromSUTime) | |
| static java.lang.String | normalizedTimeString(java.lang.String s, Timex timexFromSUTime) | |
| static <E extends CoreMap> java.lang.String | singleEntityToString(java.util.List<E> l) | Convert the content of a List of CoreMaps to a single space-separated String. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static java.lang.String BACKGROUND_SYMBOL
public static final ClassicCounter<java.lang.String> wordsToValues
public static final ClassicCounter<java.lang.String> ordinalsToValues
public static final java.util.regex.Pattern numberPattern
| Method Detail |
|---|
public static <E extends CoreMap> java.lang.String singleEntityToString(java.util.List<E> l)
Convert the content of a List of CoreMaps to a single space-separated String.
l - The List
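The conversion to a space-separated String can be pictured with a small plain-Java sketch. It does not call CoreNLP: each token is a hypothetical `Map` with a `"word"` entry standing in for a CoreMap, and the class and method names are made up for illustration only.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Illustration only: a Map with a "word" entry stands in for a CoreMap.
class EntityToStringSketch {

    // Joins the "word" value of every token with single spaces.
    static String entityToString(List<Map<String, String>> tokens) {
        StringJoiner joiner = new StringJoiner(" ");
        for (Map<String, String> token : tokens) {
            joiner.add(token.get("word"));
        }
        return joiner.toString();
    }

    // Hypothetical helper that builds a one-entry stand-in token.
    static Map<String, String> token(String word) {
        Map<String, String> t = new HashMap<>();
        t.put("word", word);
        return t;
    }

    public static void main(String[] args) {
        List<Map<String, String>> entity =
                Arrays.asList(token("$"), token("3.5"), token("million"));
        System.out.println(entityToString(entity)); // prints "$ 3.5 million"
    }
}
```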
public static java.util.List<CoreLabel> collapseNERLabels(java.util.List<CoreLabel> l)
Currently this populates a List<CoreLabel> with words from the passed List, but NER entities are collapsed and
CoreLabel constituents of entities have
NER information in their "quantity" fields.
NOTE: This now seems to be used nowhere. The collapsing is done elsewhere. That's probably appropriate; it doesn't seem like this should be part of QuantifiableEntityNormalizer, since it's set to collapse non-quantifiable entities.
l - a list of CoreLabels with NER labels
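A rough picture of what "collapsing" means, as a plain-Java sketch rather than the library's code. The assumptions are loud ones: the background symbol is taken to be `"O"`, collapsed words are joined with a space, and the `Token` class with its `quantity` field is a hypothetical stand-in for CoreLabel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: merges runs of identically labelled entity tokens.
class CollapseLabelsSketch {

    static final String BACKGROUND = "O"; // assumed background symbol

    // Hypothetical stand-in for a CoreLabel with a "quantity" field.
    static final class Token {
        final String word;
        final String quantity; // NER type of the collapsed entity, or BACKGROUND

        Token(String word, String quantity) {
            this.word = word;
            this.quantity = quantity;
        }
    }

    // words and labels are parallel lists; labels are assumed non-null.
    static List<Token> collapse(List<String> words, List<String> labels) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < words.size()) {
            String label = labels.get(i);
            StringBuilder text = new StringBuilder(words.get(i));
            int j = i + 1;
            // Only non-background runs are merged into a single token.
            if (!BACKGROUND.equals(label)) {
                while (j < words.size() && label.equals(labels.get(j))) {
                    text.append(' ').append(words.get(j));
                    j++;
                }
            }
            out.add(new Token(text.toString(), label));
            i = j;
        }
        return out;
    }

    public static void main(String[] args) {
        List<Token> collapsed = collapse(
                Arrays.asList("Bob", "paid", "5", "dollars"),
                Arrays.asList("PERSON", "O", "MONEY", "MONEY"));
        for (Token t : collapsed) {
            System.out.println(t.word + "\t" + t.quantity);
        }
    }
}
```

Note that the sketch collapses the PERSON run as readily as the MONEY run, which mirrors the NOTE above about non-quantifiable entities being collapsed too.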
public static java.lang.String normalizedTimeString(java.lang.String s,
Timex timexFromSUTime)
public static java.lang.String normalizedTimeString(java.lang.String s,
java.lang.String ampm,
Timex timexFromSUTime)
public static java.lang.String normalizedNumberString(java.lang.String s,
java.lang.String nextWord,
java.lang.Number numberFromSUTime)
public static java.lang.String normalizedNumberStringQuiet(java.lang.String s,
double multiplier,
java.lang.String nextWord,
java.lang.Number numberFromSUTime)
public static java.lang.String normalizedOrdinalString(java.lang.String s,
java.lang.Number numberFromSUTime)
public static java.lang.String normalizedOrdinalStringQuiet(java.lang.String s,
java.lang.Number numberFromSUTime)
public static java.lang.String normalizedPercentString(java.lang.String s,
java.lang.Number numberFromSUTime)
public static java.util.List<java.util.List<CoreLabel>> normalizeClassifierOutput(java.util.List<java.util.List<CoreLabel>> l)
Takes the output of an AbstractSequenceClassifier and marks up
each document by normalizing quantities. Each normalizable CoreLabel
in any of the documents will receive a "normalizedQuantity"
attribute.
l - a List of Lists of CoreLabels
public static <E extends CoreMap> void addNormalizedQuantitiesToEntities(java.util.List<E> l)
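Identifies contiguous MONEY, TIME, DATE, or PERCENT entities and tags each of their constituents with a "normalizedQuantity" label which contains the appropriate normalized string corresponding to the full quantity.
l - A list of CoreMaps representing a single
document. Note: the Labels are updated in place.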
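The in-place tagging of contiguous entities can be sketched in plain Java as follows. This is not the library's implementation: `Token`, `tagRuns`, and the pluggable `normalizer` function are hypothetical names, and the actual normalization of each entity type is left to whatever function the caller passes in.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiFunction;

// Illustration only: tags every token of a contiguous entity run in place.
class ContiguousEntityTaggingSketch {

    // Hypothetical stand-in for a CoreMap with word, NER, and normalized fields.
    static final class Token {
        final String word;
        final String ner;
        String normalizedQuantity; // stays null unless the token is part of a tagged run

        Token(String word, String ner) {
            this.word = word;
            this.ner = ner;
        }
    }

    static final Set<String> QUANTIFIABLE =
            new HashSet<>(Arrays.asList("MONEY", "TIME", "DATE", "PERCENT"));

    // normalizer maps (entity type, full entity text) to a normalized string.
    static void tagRuns(List<Token> doc, BiFunction<String, String, String> normalizer) {
        int i = 0;
        while (i < doc.size()) {
            String type = doc.get(i).ner;
            if (!QUANTIFIABLE.contains(type)) {
                i++;
                continue;
            }
            // Extend the run while the NER type stays the same.
            int end = i;
            StringBuilder text = new StringBuilder();
            while (end < doc.size() && type.equals(doc.get(end).ner)) {
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(doc.get(end).word);
                end++;
            }
            // Every constituent receives the value computed for the full quantity.
            String normalized = normalizer.apply(type, text.toString());
            for (int j = i; j < end; j++) {
                doc.get(j).normalizedQuantity = normalized;
            }
            i = end;
        }
    }

    public static void main(String[] args) {
        List<Token> doc = Arrays.asList(
                new Token("It", "O"), new Token("cost", "O"),
                new Token("5", "MONEY"), new Token("dollars", "MONEY"));
        // Placeholder normalizer: just echoes the type and the joined text.
        tagRuns(doc, (type, text) -> type + ":" + text);
        for (Token t : doc) {
            System.out.println(t.word + "\t" + t.ner + "\t" + t.normalizedQuantity);
        }
    }
}
```

Mutating the tokens rather than returning a new list mirrors the note that the Labels are updated in place.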
public static <E extends CoreMap> void addNormalizedQuantitiesToEntities(java.util.List<E> list,
boolean concatenate)
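Identifies contiguous MONEY, TIME, DATE, or PERCENT entities and tags each of their constituents with a "normalizedQuantity" label which contains the appropriate normalized string corresponding to the full quantity.
list - A list of CoreMaps representing a single
document. Note: the Labels are updated in place.
concatenate - true if quantities should be concatenated into one label, false otherwise
public static <E extends CoreLabel> java.util.List<E> applySpecializedNER(java.util.List<E> l)
Runs a deterministic named entity classifier which is good at recognizing numbers and money and date expressions not recognized by our statistical NER.
l - A document to label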