| Class | Description |
|---|---|
| AnnotatedTextReader | Cannot handle overlapping labeled text (that is, one token cannot belong to multiple labels). Note that there must be spaces around the tags for the reader to work correctly. |
| ApplyPatternsMulti | |
| ConstantsAndVariables | |
| CreatePatterns | |
| Data | |
| EditDistanceDamerauLevenshteinLike | Copied from https://gist.github.com/steveash (public domain license). Implementation of OSA (optimal string alignment), which is similar to Damerau-Levenshtein in that it counts a transposition as a single edit, but is not a true metric and can over-estimate the cost because it does not allow a substring to be edited more than once. |
| GetPatternsFromDataMultiClass | Given text and a seed list, this class gives more words like the seed words by learning surface word patterns. |
| GetPatternsFromDataMultiClass.LabelWithSeedWords | |
| InvertedIndexByTokens | Creates an inverted index of (word or lemma) => {file1 => {sentid1, sentid2,.. |
| LearnImportantFeatures | The idea is that you can learn which features are important using an ML algorithm and then use those features when learning weights for patterns. |
| PatternsAnnotations | |
| PatternsAnnotations.Features | |
| PatternsAnnotations.MatchedPattern | |
| PatternsAnnotations.MatchedPhrases | |
| PatternsAnnotations.OtherSemanticLabel | |
| PatternsAnnotations.PatternLabel1 | |
| PatternsAnnotations.PatternLabel10 | |
| PatternsAnnotations.PatternLabel2 | |
| PatternsAnnotations.PatternLabel3 | |
| PatternsAnnotations.PatternLabel4 | |
| PatternsAnnotations.PatternLabel5 | |
| PatternsAnnotations.PatternLabel6 | |
| PatternsAnnotations.PatternLabel7 | |
| PatternsAnnotations.PatternLabel8 | |
| PatternsAnnotations.PatternLabel9 | |
| PatternToken | Class to represent a target phrase. |
| PhraseScorer | |
| ScorePatterns | |
| ScorePatternsF1 | Used if patternScoring flag is set to F1 with the seed pattern. |
| ScorePatternsFreqBased | |
| ScorePatternsRatioModifiedFreq | |
| ScorePhrases | |
| ScorePhrasesAverageFeatures | Score phrases by averaging scores of individual features. |
| SurfacePattern | Represents a surface pattern in more detail than TokenSequencePattern (an object of this class is eventually compiled into a TokenSequencePattern via the toString method). |
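
The optimal string alignment (OSA) distance mentioned for EditDistanceDamerauLevenshteinLike can be illustrated with a minimal sketch. This is not the code from the gist or from the class itself; the class name `OsaDistanceSketch` and the method `distance` are hypothetical and chosen only for this example. It uses the standard dynamic-programming table with one extra case for swapping two adjacent characters. Because a swapped pair cannot be edited again, `distance("ca", "abc")` comes out as 3, whereas the unrestricted Damerau-Levenshtein distance for that pair is 2.

```java
/** Illustrative sketch only; the class and method names are hypothetical. */
public final class OsaDistanceSketch {
    private OsaDistanceSketch() {}

    /** Returns the optimal string alignment distance between a and b. */
    public static int distance(String a, String b) {
        int n = a.length();
        int m = b.length();
        // d[i][j] = cost of turning the first i chars of a into the first j chars of b
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) d[i][0] = i; // delete i characters
        for (int j = 0; j <= m; j++) d[0][j] = j; // insert j characters

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1,      // deletion
                                 d[i][j - 1] + 1),     // insertion
                        d[i - 1][j - 1] + cost);       // match or substitution
                // Swap of two adjacent characters counts as a single edit.
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[n][m];
    }
}
```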

| Enum | Description |
|---|---|
| ConstantsAndVariables.ScorePhraseMeasures | |
| GetPatternsFromDataMultiClass.PatternScoring | RlogF is from Riloff 1996, when R's denominator is (pos+neg+unlabeled) |
| PhraseScorer.Normalization | |
| SurfacePattern.Genre | |
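
As a rough illustration of the RlogF measure noted for PatternScoring, the sketch below computes R times log F, with R taken as pos / (pos + neg + unlabeled) as the description states. The class name `RlogFSketch`, the method `score`, and its parameter names are hypothetical and do not come from the package. The base-2 logarithm follows Riloff's formulation and is an assumption about this package, which may use a different base or extra smoothing.

```java
/** Illustrative sketch only; names and the log base are assumptions. */
public final class RlogFSketch {
    private RlogFSketch() {}

    /**
     * Scores a pattern from the counts of positive, negative, and
     * unlabeled phrases it extracts. Returns 0 when there are no positives.
     */
    public static double score(double pos, double neg, double unlabeled) {
        double total = pos + neg + unlabeled;
        if (pos <= 0 || total <= 0) {
            return 0.0; // log is undefined at 0, so treat as no evidence
        }
        double r = pos / total;                        // R: share of positives
        double log2Pos = Math.log(pos) / Math.log(2);  // log F, base 2 (assumed)
        return r * log2Pos;
    }
}
```

With these assumptions, a pattern that extracts 4 positive, 2 negative, and 2 unlabeled phrases has R = 0.5 and log F = 2, so it scores about 1.0. A pattern with a single positive extraction scores 0 because log 1 = 0.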