edu.stanford.nlp.wordseg
Class ChineseSegmenterFeatureFactory<IN extends CoreLabel>

java.lang.Object
  extended by edu.stanford.nlp.sequences.FeatureFactory<IN>
      extended by edu.stanford.nlp.wordseg.ChineseSegmenterFeatureFactory<IN>
All Implemented Interfaces:
java.io.Serializable

public class ChineseSegmenterFeatureFactory<IN extends CoreLabel>
extends FeatureFactory<IN>
implements java.io.Serializable

A Chinese word segmenter feature factory for the GALE project, modified from the one used for the SIGHAN Bakeoff 2005. It is intended to include all of the effective closed-track features from SIGHAN Bakeoff 2005, plus some additional "open-track" features. It will also be used for character-based chunking.

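In the template names, the second letter c stands for a Chinese character ("char"). The first letter gives its position: c means current, n means next and p means previous. For example, pc is the previous character and cc is the current character.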

Feature     Templates
            Current position clique
useWord1    CONSTANT, cc, nc, pc, pc+cc, if (As|Msr|Pk|Hk) cc+nc, pc,nc
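
As a rough illustration of how the unconditional templates above could be instantiated for one position, here is a minimal, self-contained sketch. It does not use the Stanford classes: the class name CurrentPositionTemplates, the PAD symbol, and the "name=" prefixes are all assumptions made for this example, not the feature strings the library emits. The corpus-dependent templates (those after "if (As|Msr|Pk|Hk)") are left out. ASCII letters stand in for Chinese characters.

```java
import java.util.ArrayList;
import java.util.List;

class CurrentPositionTemplates {
    // Hypothetical padding symbol for positions outside the sentence.
    static final String PAD = "<pad>";

    // Returns the character at index i as a String, or PAD when i is out of range.
    // Works on UTF-16 code units, so it assumes no supplementary characters.
    static String charAt(String sentence, int i) {
        return (i >= 0 && i < sentence.length())
                ? String.valueOf(sentence.charAt(i))
                : PAD;
    }

    // Builds the five unconditional templates (CONSTANT, cc, nc, pc, pc+cc)
    // for position loc. The "name=" prefixes are illustrative only.
    static List<String> features(String sentence, int loc) {
        String pc = charAt(sentence, loc - 1); // previous character
        String cc = charAt(sentence, loc);     // current character
        String nc = charAt(sentence, loc + 1); // next character
        List<String> out = new ArrayList<>();
        out.add("CONSTANT");
        out.add("cc=" + cc);
        out.add("nc=" + nc);
        out.add("pc=" + pc);
        out.add("pc+cc=" + pc + cc);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(features("ABCD", 1));
    }
}
```

For the input "ABCD" at position 1, this sketch produces [CONSTANT, cc=B, nc=C, pc=A, pc+cc=AB]. At position 0 the previous character falls outside the sentence, so pc becomes the padding symbol.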

Author:
Huihsin Tseng, Pichuan Chang
See Also:
Serialized Form

Field Summary
 
Fields inherited from class edu.stanford.nlp.sequences.FeatureFactory
cliqueC, cliqueCnC, cliqueCp2C, cliqueCp3C, cliqueCp4C, cliqueCp5C, cliqueCpC, cliqueCpCnC, cliqueCpCp2C, cliqueCpCp2Cp3C, cliqueCpCp2Cp3Cp4C, cliqueCpCp2Cp3Cp4Cp5C, flags, knownCliques
 
Constructor Summary
ChineseSegmenterFeatureFactory()
           
 
Method Summary
protected  java.util.Collection<java.lang.String> featuresC(PaddedList<IN> cInfo, int loc)
           
protected  java.util.Collection<java.lang.String> featuresCnC(PaddedList<IN> cInfo, int loc)
           
protected  java.util.Collection<java.lang.String> featuresCpC(PaddedList<IN> cInfo, int loc)
           
 java.util.Collection<java.lang.String> getCliqueFeatures(PaddedList<IN> cInfo, int loc, Clique clique)
          Extracts all the features from the input data at a certain index.
 void init(SeqClassifierFlags flags)
           
 
Methods inherited from class edu.stanford.nlp.sequences.FeatureFactory
addAllInterningAndSuffixing, getCliques, getCliques, getWord
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ChineseSegmenterFeatureFactory

public ChineseSegmenterFeatureFactory()
Method Detail

init

public void init(SeqClassifierFlags flags)
Overrides:
init in class FeatureFactory<IN extends CoreLabel>

getCliqueFeatures

public java.util.Collection<java.lang.String> getCliqueFeatures(PaddedList<IN> cInfo,
                                                                int loc,
                                                                Clique clique)
Extracts all the features from the input data at a certain index.

Specified by:
getCliqueFeatures in class FeatureFactory<IN extends CoreLabel>
Parameters:
cInfo - The complete data set as a PaddedList of IN tokens
loc - The index at which to extract features.
clique - The particular clique for which to extract features. It should be a member of the knownCliques list.
Returns:
A Collection of the features calculated for the token at the specified position in cInfo.
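
The method list suggests that getCliqueFeatures selects one of the protected extractors (featuresC, featuresCpC, featuresCnC) according to the clique it is given, and the inherited addAllInterningAndSuffixing hints that features are tagged with a per-clique suffix. The sketch below illustrates that dispatch pattern only, under those assumptions. It is self-contained and does not use the Stanford classes: SketchClique, the list-of-strings input, and the "|" suffix format are hypothetical stand-ins, and the real method may combine extractors differently.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;

class CliqueDispatchSketch {
    // Hypothetical stand-in for the inherited constants cliqueC, cliqueCpC, cliqueCnC.
    enum SketchClique { C, CpC, CnC }

    // Stand-in extractor for the current position: just the current character.
    static Collection<String> featuresC(List<String> chars, int loc) {
        List<String> out = new ArrayList<>();
        out.add(chars.get(loc));
        return out;
    }

    // Stand-in extractor for the previous+current pair; empty at the left edge.
    static Collection<String> featuresCpC(List<String> chars, int loc) {
        List<String> out = new ArrayList<>();
        if (loc > 0) out.add(chars.get(loc - 1) + chars.get(loc));
        return out;
    }

    // Stand-in extractor for the current+next pair; empty at the right edge.
    static Collection<String> featuresCnC(List<String> chars, int loc) {
        List<String> out = new ArrayList<>();
        if (loc + 1 < chars.size()) out.add(chars.get(loc) + chars.get(loc + 1));
        return out;
    }

    // Picks the extractor for the requested clique, then appends the clique name
    // so that identical strings from different cliques remain distinct features.
    static Collection<String> getCliqueFeatures(List<String> chars, int loc, SketchClique clique) {
        Collection<String> raw;
        switch (clique) {
            case C:   raw = featuresC(chars, loc);   break;
            case CpC: raw = featuresCpC(chars, loc); break;
            case CnC: raw = featuresCnC(chars, loc); break;
            default:  throw new IllegalArgumentException("unknown clique: " + clique);
        }
        Collection<String> out = new LinkedHashSet<>();
        for (String f : raw) {
            out.add(f + "|" + clique);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> chars = List.of("A", "B", "C");
        for (SketchClique clique : SketchClique.values()) {
            System.out.println(clique + " -> " + getCliqueFeatures(chars, 1, clique));
        }
    }
}
```

Suffixing by clique keeps the feature spaces of the different cliques separate, which matters because each clique is scored against a different set of label combinations.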

featuresC

protected java.util.Collection<java.lang.String> featuresC(PaddedList<IN> cInfo,
                                                           int loc)

featuresCpC

protected java.util.Collection<java.lang.String> featuresCpC(PaddedList<IN> cInfo,
                                                             int loc)

featuresCnC

protected java.util.Collection<java.lang.String> featuresCnC(PaddedList<IN> cInfo,
                                                             int loc)


Stanford NLP Group