Class PhoneticEngine
- java.lang.Object
  - org.apache.commons.codec.language.bm.PhoneticEngine
public class PhoneticEngine extends java.lang.Object
Converts words into potential phonetic representations. This is a two-stage process. First, the word is converted into a phonetic representation that takes into account the likely source language. Next, this phonetic representation is converted into a pan-European 'average' representation, allowing comparison between different versions of essentially the same word from different languages.
This class is intentionally immutable and thread-safe. If you wish to alter the settings for a PhoneticEngine, you must make a new one with the updated settings.
Ported from phoneticengine.php
- Since:
- 1.6
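The immutability contract described above means that a setting is changed by constructing a fresh object rather than mutating an existing one. The following is a minimal sketch of that pattern, not the library's code: `EngineSettings` and `withMaxPhonemes` are hypothetical names introduced for illustration, and the default of 20 is an assumption made for this sketch only. The two constructors mirror the shape of the documented `PhoneticEngine` constructors, with and without `maxPhonemes`.

```java
/**
 * Hypothetical stand-in for an immutable, thread-safe settings holder.
 * This is NOT the real PhoneticEngine; it only mirrors the documented shape.
 */
final class EngineSettings {
    // Assumed default, chosen for this sketch only.
    private static final int DEFAULT_MAX_PHONEMES = 20;

    // All state is final, so instances can be shared between threads.
    private final boolean concat;
    private final int maxPhonemes;

    // Shorter constructor delegates to the full one with the default cap.
    EngineSettings(boolean concat) {
        this(concat, DEFAULT_MAX_PHONEMES);
    }

    EngineSettings(boolean concat, int maxPhonemes) {
        this.concat = concat;
        this.maxPhonemes = maxPhonemes;
    }

    boolean isConcat() {
        return concat;
    }

    int getMaxPhonemes() {
        return maxPhonemes;
    }

    // Hypothetical helper: returns a new instance and leaves the receiver untouched.
    EngineSettings withMaxPhonemes(int newMaxPhonemes) {
        return new EngineSettings(concat, newMaxPhonemes);
    }
}
```

With the real class, the equivalent step is simply to call one of the public constructors again with the updated arguments.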
-
Nested Class Summary
Nested Classes (modifier and type, class, description)
(package private) static class PhoneticEngine.PhonemeBuilder
- Utility for manipulating a set of phonemes as they are being built up.
private static class PhoneticEngine.RulesApplication
- A function closure capturing the application of a list of rules to an input sequence at a particular offset.
-
Field Summary
Fields (modifier and type, field)
private boolean concat
private static int DEFAULT_MAX_PHONEMES
private Lang lang
private int maxPhonemes
private static java.util.Map<NameType,java.util.Set<java.lang.String>> NAME_PREFIXES
private NameType nameType
private RuleType ruleType
-
Constructor Summary
Constructors (constructor, description)
PhoneticEngine(NameType nameType, RuleType ruleType, boolean concat)
- Generates a new, fully-configured phonetic engine.
PhoneticEngine(NameType nameType, RuleType ruleType, boolean concat, int maxPhonemes)
- Generates a new, fully-configured phonetic engine.
-
Method Summary
Methods (modifier and type, method, description)
private PhoneticEngine.PhonemeBuilder applyFinalRules(PhoneticEngine.PhonemeBuilder phonemeBuilder, java.util.Map<java.lang.String,java.util.List<Rule>> finalRules)
- Applies the final rules to convert from a language-specific phonetic representation to a language-independent representation.
java.lang.String encode(java.lang.String input)
- Encodes a string to its phonetic representation.
java.lang.String encode(java.lang.String input, Languages.LanguageSet languageSet)
- Encodes an input string into an output phonetic representation, given a set of possible origin languages.
Lang getLang()
- Gets the Lang language guessing rules being used.
int getMaxPhonemes()
- Gets the maximum number of phonemes the engine will calculate for a given input.
NameType getNameType()
- Gets the NameType being used.
RuleType getRuleType()
- Gets the RuleType being used.
boolean isConcat()
- Gets if multiple phonetic encodings are concatenated or if just the first one is kept.
private static java.lang.String join(java.lang.Iterable<java.lang.String> strings, java.lang.String sep)
- Joins some strings with an internal separator.
-
Field Detail
-
NAME_PREFIXES
private static final java.util.Map<NameType,java.util.Set<java.lang.String>> NAME_PREFIXES
-
DEFAULT_MAX_PHONEMES
private static final int DEFAULT_MAX_PHONEMES
- See Also:
- Constant Field Values
-
lang
private final Lang lang
-
nameType
private final NameType nameType
-
ruleType
private final RuleType ruleType
-
concat
private final boolean concat
-
maxPhonemes
private final int maxPhonemes
-
Constructor Detail
-
PhoneticEngine
public PhoneticEngine(NameType nameType, RuleType ruleType, boolean concat)
Generates a new, fully-configured phonetic engine.
- Parameters:
nameType
- the type of names it will use
ruleType
- the type of rules it will apply
concat
- if it will concatenate multiple encodings
-
PhoneticEngine
public PhoneticEngine(NameType nameType, RuleType ruleType, boolean concat, int maxPhonemes)
Generates a new, fully-configured phonetic engine.
- Parameters:
nameType
- the type of names it will use
ruleType
- the type of rules it will apply
concat
- if it will concatenate multiple encodings
maxPhonemes
- the maximum number of phonemes that will be handled
- Since:
- 1.7
-
Method Detail
-
join
private static java.lang.String join(java.lang.Iterable<java.lang.String> strings, java.lang.String sep)
Joins some strings with an internal separator.
- Parameters:
strings
- Strings to join
sep
- String to separate them with
- Returns:
- a single String consisting of each element of strings interleaved by sep
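Because `join` is private, it cannot be called from outside the class. The behaviour described above (each element of `strings` interleaved by `sep`) can be sketched as follows. This is an illustrative reimplementation under that description, not the library's source, and `JoinSketch` is a hypothetical class name.

```java
import java.util.Iterator;

/** Hypothetical illustration of joining strings with an internal separator. */
final class JoinSketch {
    static String join(Iterable<String> strings, String sep) {
        StringBuilder sb = new StringBuilder();
        Iterator<String> it = strings.iterator();
        // Append the first element without a separator in front of it.
        if (it.hasNext()) {
            sb.append(it.next());
        }
        // Every later element is preceded by the separator.
        while (it.hasNext()) {
            sb.append(sep).append(it.next());
        }
        return sb.toString();
    }
}
```

On Java 8 and later, the standard `String.join(sep, strings)` produces the same result for an `Iterable<String>`.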
-
applyFinalRules
private PhoneticEngine.PhonemeBuilder applyFinalRules(PhoneticEngine.PhonemeBuilder phonemeBuilder, java.util.Map<java.lang.String,java.util.List<Rule>> finalRules)
Applies the final rules to convert from a language-specific phonetic representation to a language-independent representation.
- Parameters:
phonemeBuilder
- the current phonemes
finalRules
- the final rules to apply
- Returns:
- the resulting phonemes
-
encode
public java.lang.String encode(java.lang.String input)
Encodes a string to its phonetic representation.
- Parameters:
input
- the String to encode
- Returns:
- the encoding of the input
-
encode
public java.lang.String encode(java.lang.String input, Languages.LanguageSet languageSet)
Encodes an input string into an output phonetic representation, given a set of possible origin languages.
- Parameters:
input
- String to phoneticise; a String with dashes or spaces separating each word
languageSet
- set of possible origin languages
- Returns:
- a phonetic representation of the input; a String containing '-'-separated phonetic representations of the input
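The documented input and output shapes (words separated by dashes or spaces going in, '-'-separated representations coming out) can be illustrated with a small sketch. It is not the library's algorithm: `WordwiseEncodingSketch` and `encodeEachWord` are hypothetical names, the per-word encoder is a caller-supplied placeholder rather than real phonetic rules, and the real method performs language-dependent processing, governed by the engine's settings, that is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical illustration of the documented input and output format only. */
final class WordwiseEncodingSketch {
    static String encodeEachWord(String input, Function<String, String> encodeWord) {
        List<String> encoded = new ArrayList<>();
        // Treat runs of whitespace or dashes as word boundaries.
        for (String word : input.trim().split("[\\s-]+")) {
            // split() can yield an empty first token, for example on a leading dash.
            if (!word.isEmpty()) {
                encoded.add(encodeWord.apply(word));
            }
        }
        // Join the per-word results with '-', matching the documented return format.
        return String.join("-", encoded);
    }
}
```

For example, passing `String::toUpperCase` as the placeholder encoder turns `"van der berg"` into `"VAN-DER-BERG"`; a real encoder would return phonetic representations instead.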
-
getLang
public Lang getLang()
Gets the Lang language guessing rules being used.
- Returns:
- the Lang in use
-
getNameType
public NameType getNameType()
Gets the NameType being used.
- Returns:
- the NameType in use
-
getRuleType
public RuleType getRuleType()
Gets the RuleType being used.
- Returns:
- the RuleType in use
-
isConcat
public boolean isConcat()
Gets if multiple phonetic encodings are concatenated or if just the first one is kept.
- Returns:
- true if multiple phonetic encodings are returned; false if only the first one is returned
-
getMaxPhonemes
public int getMaxPhonemes()
Gets the maximum number of phonemes the engine will calculate for a given input.
- Returns:
- the maximum number of phonemes
- Since:
- 1.7
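One way to picture a cap on the number of phonemes calculated for an input is a collector that stops accepting candidates once the limit is reached. The sketch below assumes that reading of `maxPhonemes`; `PhonemeCapSketch` and `cap` are hypothetical names, and the library's internal bookkeeping may differ.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical illustration of capping a set of candidate phonemes. */
final class PhonemeCapSketch {
    static Set<String> cap(Iterable<String> candidates, int maxPhonemes) {
        // LinkedHashSet removes duplicates while preserving first-seen order.
        Set<String> kept = new LinkedHashSet<>();
        for (String candidate : candidates) {
            // Stop as soon as the cap is reached; later candidates are ignored.
            if (kept.size() >= maxPhonemes) {
                break;
            }
            kept.add(candidate);
        }
        return kept;
    }
}
```

A lower cap bounds the work done for inputs with many alternative pronunciations, at the cost of discarding the later alternatives.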
-