java.lang.Object
  edu.stanford.nlp.international.arabic.IBMArabicEscaper
public class IBMArabicEscaper
This escaper is intended for use on flat input to be parsed by LexicalizedParser.
It performs these functions:
ArabicTreeNormalizer
This class supports both Buckwalter and UTF-8 encoding.
IMPORTANT: This class must implement Function<List<HasWord>, List<HasWord>>
in order to run with the parser.
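The requirement above suggests that the parser holds its escaper only through the Function interface. A minimal sketch of that arrangement follows, assuming java.util.function.Function as a stand-in for the library's Function type and String as a stand-in for HasWord; the class EscaperHookSketch and its prepare method are hypothetical names, not part of the Stanford API.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical consumer standing in for the parser's pre-processing hook.
final class EscaperHookSketch {

    // The escaper is known only by its interface type, so any implementation fits.
    private final Function<List<String>, List<String>> escaper;

    EscaperHookSketch(Function<List<String>, List<String>> escaper) {
        this.escaper = escaper;
    }

    // Runs the escaper before any (omitted) parsing step would see the tokens.
    List<String> prepare(List<String> tokens) {
        return escaper.apply(tokens);
    }
}
```

Because the hook depends on the interface alone, a class that lacks the Function contract cannot be handed to it, which is one plausible reading of the note above.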
| Constructor Summary | |
|---|---|
| IBMArabicEscaper() | |
| IBMArabicEscaper(boolean annoteAndClassOnly) | |
| Method Summary | |
|---|---|
| java.util.List<HasWord> | apply(java.util.List<HasWord> sentence): Converts an input list of HasWord in IBM Arabic to LDC ATBv3 representation. |
| java.lang.String | apply(java.lang.String w): Applies escaping to a single word. |
| void | disableWarnings(): Disable warnings generated when tokens are escaped. |
| static void | main(java.lang.String[] args): This main method preprocesses one-sentence-per-line input, making the same changes as the Function. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public IBMArabicEscaper()
public IBMArabicEscaper(boolean annoteAndClassOnly)
| Method Detail |
|---|
public void disableWarnings()
public java.util.List<HasWord> apply(java.util.List<HasWord> sentence)
Converts an input list of HasWord in IBM Arabic to LDC ATBv3 representation. The method safely copies the input object prior to escaping.
Specified by: apply in interface Function<java.util.List<HasWord>,java.util.List<HasWord>>
Parameters: sentence - A collection of type Word
Throws: java.lang.RuntimeException - If a word is mapped to null

public java.lang.String apply(java.lang.String w)
Applies escaping to a single word.
Parameters: w - The word
Throws: java.lang.RuntimeException - If a word is nullified (which is really bad for the parser and for MT)

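The two documented guarantees of the list-level apply, copying the input before escaping and failing when a word is mapped to null, can be sketched as follows. This is an illustration under stated assumptions, not the Stanford implementation: String stands in for HasWord, java.util.function.Function stands in for the library's Function type, and CopyingEscaperSketch and its wordRule parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Stand-in for the escaper's list-level contract; NOT the Stanford implementation.
final class CopyingEscaperSketch implements Function<List<String>, List<String>> {

    // Hypothetical per-word rule, supplied by the caller for illustration.
    private final Function<String, String> wordRule;

    CopyingEscaperSketch(Function<String, String> wordRule) {
        this.wordRule = wordRule;
    }

    @Override
    public List<String> apply(List<String> sentence) {
        // Build a new list so the caller's list is never modified.
        List<String> escaped = new ArrayList<>(sentence.size());
        for (String word : sentence) {
            String mapped = wordRule.apply(word);
            if (mapped == null) {
                // Mirrors the documented failure mode: a word mapped to null.
                throw new RuntimeException("Word mapped to null: " + word);
            }
            escaped.add(mapped);
        }
        return escaped;
    }
}
```

Failing fast on a null mapping keeps an empty token from silently reaching a downstream consumer, which matches the concern the exception description raises for the parser and for MT.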
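public static void main(java.lang.String[] args)
                 throws java.io.IOException
This main method preprocesses one-sentence-per-line input, making the same changes as the Function. By default, output is written to files named after the input files with .sent appended to their names. If you give the flag -f then output is instead sent to stdout. Input and output are always in UTF-8.
Parameters: args - A list of filenames. The files must be UTF-8 encoded.
Throws: java.io.IOException - If there are any issues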
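
The file handling described for main can be sketched as below: each input file is read as UTF-8, one sentence per line, and the result goes either to a sibling file with .sent appended or to stdout when -f is present. The class SentFileSketch, the run method, and the lineRule parameter are hypothetical; the sketch also assumes the -f flag may appear anywhere in the argument list, which the documentation does not state.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.UnaryOperator;

// Sketch of the documented command-line behaviour; NOT the Stanford source.
final class SentFileSketch {

    // lineRule is a hypothetical stand-in for the escaping applied to each line.
    static void run(String[] args, UnaryOperator<String> lineRule) throws IOException {
        boolean toStdout = false;
        for (String arg : args) {
            if ("-f".equals(arg)) {
                toStdout = true; // assumption: the flag may appear anywhere in args
            }
        }
        for (String arg : args) {
            if ("-f".equals(arg)) {
                continue;
            }
            Path input = Paths.get(arg);
            if (toStdout) {
                // Do not close System.out; flush so buffered text is emitted.
                BufferedWriter out = new BufferedWriter(
                        new OutputStreamWriter(System.out, StandardCharsets.UTF_8));
                copy(input, out, lineRule);
                out.flush();
            } else {
                Path output = Paths.get(arg + ".sent");
                try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
                    copy(input, out, lineRule);
                }
            }
        }
    }

    // Reads the input line by line and writes each transformed line to out.
    private static void copy(Path input, BufferedWriter out, UnaryOperator<String> lineRule)
            throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.write(lineRule.apply(line));
                out.newLine();
            }
        }
    }
}
```

Naming the charset explicitly on both the reader and the writer keeps the output independent of the platform default encoding, in line with the statement that input and output are always UTF-8.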