Class UserDictionary
- java.lang.Object
  - org.apache.lucene.analysis.ko.dict.UserDictionary
- All Implemented Interfaces:
Dictionary
public final class UserDictionary extends java.lang.Object implements Dictionary
Class for building a user dictionary. It allows adding custom nouns (for example 세종) or compounds (for example 세종시 세종 시, where the first token is the compound and the remaining tokens are its decomposition).
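The two entry shapes mentioned above can be illustrated with a small stand-alone sketch. It assumes that an entry is a single whitespace-separated line whose first token is the surface form and whose remaining tokens, if any, are the parts of a compound. The class UserEntrySketch and its parse method are hypothetical names used only for illustration; they are not part of Lucene and do not claim to reproduce its parsing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Lucene: interprets one user dictionary line. */
final class UserEntrySketch {
    final String surface;        // the word as it appears in text
    final List<String> segments; // decomposition; empty for a simple noun

    UserEntrySketch(String surface, List<String> segments) {
        this.surface = surface;
        this.segments = segments;
    }

    /** Assumed format: first token is the surface form, the rest are its parts. */
    static UserEntrySketch parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        List<String> parts =
            new ArrayList<>(Arrays.asList(tokens).subList(1, tokens.length));
        return new UserEntrySketch(tokens[0], parts);
    }

    public static void main(String[] args) {
        UserEntrySketch noun = parse("세종");
        UserEntrySketch compound = parse("세종시 세종 시");
        System.out.println(noun.surface + " -> " + noun.segments);
        System.out.println(compound.surface + " -> " + compound.segments);
    }
}
```

A simple noun yields an empty segment list, while a compound carries its parts in order.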
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.apache.lucene.analysis.ko.dict.Dictionary
Dictionary.Morpheme
-
-
Field Summary
Fields (Modifier and Type, Field):
- private TokenInfoFST fst
- private static short LEFT_ID
- private static short RIGHT_ID
- private static short RIGHT_ID_F
- private static short RIGHT_ID_T
- private short[] rightIds
- private int[][] segmentations
- private static int WORD_COST
-
Constructor Summary
Constructors (Modifier, Constructor):
- private UserDictionary(java.util.List<java.lang.String> entries)
-
Method Summary
Methods (Modifier and Type, Method, Description):
- TokenInfoFST getFST()
- int getLeftId(int wordId): Get left id of specified word.
- POS.Tag getLeftPOS(int wordId): Get the left POS.Tag of specified word.
- Dictionary.Morpheme[] getMorphemes(int wordId, char[] surfaceForm, int off, int len): Get the morphemes of specified word (e.g. 가깝으나: 가깝 + 으나).
- POS.Type getPOSType(int wordId): Get the POS.Type of specified word (morpheme, compound, inflect or pre-analysis).
- java.lang.String getReading(int wordId): Get the reading of specified word (mainly used for Hanja to Hangul conversion).
- int getRightId(int wordId): Get right id of specified word.
- POS.Tag getRightPOS(int wordId): Get the right POS.Tag of specified word.
- int getWordCost(int wordId): Get word cost of specified word.
- java.util.List<java.lang.Integer> lookup(char[] chars, int off, int len): Lookup words in text.
- static UserDictionary open(java.io.Reader reader)
-
-
-
Field Detail
-
fst
private final TokenInfoFST fst
-
WORD_COST
private static final int WORD_COST
- See Also:
- Constant Field Values
-
LEFT_ID
private static final short LEFT_ID
- See Also:
- Constant Field Values
-
RIGHT_ID
private static final short RIGHT_ID
- See Also:
- Constant Field Values
-
RIGHT_ID_T
private static final short RIGHT_ID_T
- See Also:
- Constant Field Values
-
RIGHT_ID_F
private static final short RIGHT_ID_F
- See Also:
- Constant Field Values
-
segmentations
private final int[][] segmentations
-
rightIds
private final short[] rightIds
-
-
Method Detail
-
open
public static UserDictionary open(java.io.Reader reader) throws java.io.IOException
- Throws:
java.io.IOException
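Since open takes a java.io.Reader, entries can come from a file or from in-memory text. The following sketch only prepares such a Reader using the standard library; it assumes one entry per line. The class UserDictionaryReaderSketch and its methods are hypothetical, and the actual call to open is left as a comment so that the snippet compiles without Lucene on the classpath.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical helper, not part of Lucene: builds a Reader over in-memory entries. */
final class UserDictionaryReaderSketch {

    /** Joins the entries one per line (an assumption) and wraps them in a Reader. */
    static Reader readerFor(String... entries) {
        return new StringReader(String.join("\n", entries));
    }

    /** Counts lines; stands in here for whatever consumes the Reader. */
    static int countLines(Reader reader) {
        int n = 0;
        try (BufferedReader br = new BufferedReader(reader)) {
            while (br.readLine() != null) {
                n++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return n;
    }

    public static void main(String[] args) {
        Reader reader = readerFor("세종", "세종시 세종 시");
        // With Lucene's nori module on the classpath, this Reader would instead be
        // passed to the method documented above:
        //   UserDictionary dict = UserDictionary.open(reader);
        System.out.println(countLines(reader) + " entries prepared");
    }
}
```

For a dictionary stored on disk, a Reader obtained from java.nio.file.Files.newBufferedReader with an explicit UTF-8 charset would serve the same purpose.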
-
getFST
public TokenInfoFST getFST()
-
getLeftId
public int getLeftId(int wordId)
Description copied from interface: Dictionary
Get left id of specified word.
- Specified by:
getLeftId in interface Dictionary
-
getRightId
public int getRightId(int wordId)
Description copied from interface: Dictionary
Get right id of specified word.
- Specified by:
getRightId in interface Dictionary
-
getWordCost
public int getWordCost(int wordId)
Description copied from interface: Dictionary
Get word cost of specified word.
- Specified by:
getWordCost in interface Dictionary
-
getPOSType
public POS.Type getPOSType(int wordId)
Description copied from interface: Dictionary
Get the POS.Type of specified word (morpheme, compound, inflect or pre-analysis).
- Specified by:
getPOSType in interface Dictionary
-
getLeftPOS
public POS.Tag getLeftPOS(int wordId)
Description copied from interface: Dictionary
Get the left POS.Tag of specified word. For POS.Type.MORPHEME and POS.Type.COMPOUND the left and right POS are the same.
- Specified by:
getLeftPOS in interface Dictionary
-
getRightPOS
public POS.Tag getRightPOS(int wordId)
Description copied from interface: Dictionary
Get the right POS.Tag of specified word. For POS.Type.MORPHEME and POS.Type.COMPOUND the left and right POS are the same.
- Specified by:
getRightPOS in interface Dictionary
-
getReading
public java.lang.String getReading(int wordId)
Description copied from interface: Dictionary
Get the reading of specified word (mainly used for Hanja to Hangul conversion).
- Specified by:
getReading in interface Dictionary
-
getMorphemes
public Dictionary.Morpheme[] getMorphemes(int wordId, char[] surfaceForm, int off, int len)
Description copied from interface: Dictionary
Get the morphemes of specified word (e.g. 가깝으나: 가깝 + 으나).
- Specified by:
getMorphemes in interface Dictionary
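The surfaceForm, off and len parameters describe a window into a character buffer, and a decomposition such as 가깝 + 으나 can be expressed as consecutive segment lengths inside that window. The sketch below shows this idea with the standard library only. It assumes that a decomposition is stored as an array of lengths, which is a guess suggested by the int[][] segmentations field; MorphemeSplitSketch and split are hypothetical names and return plain strings rather than Dictionary.Morpheme objects.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Lucene: splits a surface form by segment lengths. */
final class MorphemeSplitSketch {

    /**
     * Cuts the window surfaceForm[off, off + len) into consecutive pieces whose
     * lengths are given by segmentLengths. Lengths are counted in Java chars.
     */
    static List<String> split(char[] surfaceForm, int off, int len, int[] segmentLengths) {
        List<String> pieces = new ArrayList<>();
        int consumed = 0;
        for (int segLen : segmentLengths) {
            if (consumed + segLen > len) {
                throw new IllegalArgumentException("segments exceed the surface form");
            }
            pieces.add(new String(surfaceForm, off + consumed, segLen));
            consumed += segLen;
        }
        return pieces;
    }

    public static void main(String[] args) {
        char[] text = "나는 세종시 에".toCharArray();
        // "세종시" starts at offset 3 and spans 3 chars; its parts have lengths 2 and 1.
        System.out.println(split(text, 3, 3, new int[] {2, 1}));
    }
}
```

Working on offsets into a shared buffer avoids copying the whole text for each candidate word; only the resulting pieces are materialized as strings.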
-
lookup
public java.util.List<java.lang.Integer> lookup(char[] chars, int off, int len) throws java.io.IOException
Lookup words in text.
- Parameters:
chars - text
off - offset into text
len - length of text
- Returns:
list of wordId
- Throws:
java.io.IOException
-
-