Class DoubleMetaphone

  • All Implemented Interfaces:
    Encoder, StringEncoder

    public class DoubleMetaphone
    extends java.lang.Object
    implements StringEncoder
    Encodes a string into a Double Metaphone value. This implementation is based on the algorithm by Lawrence Philips.

    This class is conditionally thread-safe. The instance field for the maximum code length can be changed through setMaxCodeLen(int), but it is not volatile and accesses to it are not synchronized. If an instance is shared between threads, the caller must use suitable synchronization to publish the value safely between threads, and must not invoke setMaxCodeLen(int) after initial setup.
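
One way to follow this rule is to finish configuration before the encoder becomes visible to other threads and never call the setter afterwards. The sketch below uses a hypothetical stand-in class, EncoderStandIn, so that it compiles without the library; SharedEncoderHolder is likewise an illustrative name and not part of this API. With the real class the same shape would apply.

```java
// Hypothetical stand-in with the same shape as the documented field:
// a mutable int that is neither volatile nor synchronized.
final class EncoderStandIn {
    private int maxCodeLen = 4;

    int getMaxCodeLen() {
        return maxCodeLen;
    }

    void setMaxCodeLen(int maxCodeLen) {
        this.maxCodeLen = maxCodeLen;
    }
}

// Illustrative holder: configure first, publish through a final field, never reconfigure.
final class SharedEncoderHolder {
    private final EncoderStandIn encoder;

    SharedEncoderHolder(int maxCodeLen) {
        EncoderStandIn configured = new EncoderStandIn();
        configured.setMaxCodeLen(maxCodeLen); // initial setup, before any sharing
        this.encoder = configured;            // final field, assigned after configuration
    }

    EncoderStandIn encoder() {
        return encoder;
    }

    public static void main(String[] args) throws InterruptedException {
        SharedEncoderHolder holder = new SharedEncoderHolder(6);
        // Both readers only call the getter; nobody calls the setter after setup.
        Runnable reader = () -> System.out.println(holder.encoder().getMaxCodeLen());
        Thread t1 = new Thread(reader);
        Thread t2 = new Thread(reader);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

The final field is the design choice that matters here: as long as the holder does not leak from its own constructor, readers that obtain it see the configured value without extra locking.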

    See Also:
    Original Article, http://en.wikipedia.org/wiki/Metaphone
    • Field Detail

      • VOWELS

        private static final java.lang.String VOWELS
        "Vowels" to test for
        See Also:
        Constant Field Values
      • SILENT_START

        private static final java.lang.String[] SILENT_START
        Prefixes that, when present, are not pronounced
      • L_R_N_M_B_H_F_V_W_SPACE

        private static final java.lang.String[] L_R_N_M_B_H_F_V_W_SPACE
      • ES_EP_EB_EL_EY_IB_IL_IN_IE_EI_ER

        private static final java.lang.String[] ES_EP_EB_EL_EY_IB_IL_IN_IE_EI_ER
      • L_T_K_S_N_M_B_Z

        private static final java.lang.String[] L_T_K_S_N_M_B_Z
      • maxCodeLen

        private int maxCodeLen
        Maximum length of an encoding; the default is 4
    • Constructor Detail

      • DoubleMetaphone

        public DoubleMetaphone()
        Creates an instance of the DoubleMetaphone encoder.
    • Method Detail

      • doubleMetaphone

        public java.lang.String doubleMetaphone​(java.lang.String value)
        Encode a value with Double Metaphone.
        Parameters:
        value - String to encode
        Returns:
        an encoded string
      • doubleMetaphone

        public java.lang.String doubleMetaphone​(java.lang.String value,
                                                boolean alternate)
        Encode a value with Double Metaphone, optionally using the alternate encoding.
        Parameters:
        value - String to encode
        alternate - use the alternate encoding if true
        Returns:
        an encoded string
      • encode

        public java.lang.Object encode​(java.lang.Object obj)
                                throws EncoderException
        Encode the value using DoubleMetaphone. It will only work if obj is a String (like Metaphone).
        Specified by:
        encode in interface Encoder
        Parameters:
        obj - Object to encode (should be of type String)
        Returns:
        An encoded Object (will be of type String)
        Throws:
        EncoderException - encode parameter is not of type String
      • encode

        public java.lang.String encode​(java.lang.String value)
        Encode the value using DoubleMetaphone.
        Specified by:
        encode in interface StringEncoder
        Parameters:
        value - String to encode
        Returns:
        An encoded String
      • isDoubleMetaphoneEqual

        public boolean isDoubleMetaphoneEqual​(java.lang.String value1,
                                              java.lang.String value2)
        Check if the Double Metaphone values of two String values are equal.
        Parameters:
        value1 - The first String to encode; its encoding is the left-hand side of String.equals(Object).
        value2 - The second String to encode; its encoding is the right-hand side of String.equals(Object).
        Returns:
        true if the encoded Strings are equal; false otherwise.
        See Also:
        isDoubleMetaphoneEqual(String,String,boolean)
      • isDoubleMetaphoneEqual

        public boolean isDoubleMetaphoneEqual​(java.lang.String value1,
                                              java.lang.String value2,
                                              boolean alternate)
        Check if the Double Metaphone values of two String values are equal, optionally using the alternate value.
        Parameters:
        value1 - The first String to encode; its encoding is the left-hand side of String.equals(Object).
        value2 - The second String to encode; its encoding is the right-hand side of String.equals(Object).
        alternate - use the alternate value if true.
        Returns:
        true if the encoded Strings are equal; false otherwise.
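
The relationship between the two overloads can be sketched as follows. This is an illustration under assumptions, not the library's source: the encoder is passed in as a hypothetical BiFunction standing in for doubleMetaphone(value, alternate), the two-argument form is assumed to compare primary encodings, and the names EncodedEqualitySketch and isEncodedEqual are invented for the example.

```java
import java.util.Locale;
import java.util.Objects;
import java.util.function.BiFunction;

final class EncodedEqualitySketch {
    // Hypothetical stand-in for doubleMetaphone(value, alternate).
    private final BiFunction<String, Boolean, String> encoder;

    EncodedEqualitySketch(BiFunction<String, Boolean, String> encoder) {
        this.encoder = encoder;
    }

    // Assumption: the two-argument form compares the primary (non-alternate) encodings.
    boolean isEncodedEqual(String value1, String value2) {
        return isEncodedEqual(value1, value2, false);
    }

    // Encode both inputs the same way, then compare the results with equals.
    boolean isEncodedEqual(String value1, String value2, boolean alternate) {
        return Objects.equals(encoder.apply(value1, alternate), encoder.apply(value2, alternate));
    }

    public static void main(String[] args) {
        // Toy encoder for demonstration only: keeps the first letter, upper-cased.
        EncodedEqualitySketch sketch = new EncodedEqualitySketch(
                (value, alternate) -> value.isEmpty() ? "" : value.substring(0, 1).toUpperCase(Locale.ROOT));
        System.out.println(sketch.isEncodedEqual("smith", "Schmidt"));     // true with the toy encoder
        System.out.println(sketch.isEncodedEqual("smith", "jones", true)); // false with the toy encoder
    }
}
```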
      • getMaxCodeLen

        public int getMaxCodeLen()
        Returns the maximum code length.
        Returns:
        the maximum length of an encoding
      • setMaxCodeLen

        public void setMaxCodeLen​(int maxCodeLen)
        Sets the maximum code length.
        Parameters:
        maxCodeLen - the maximum length of an encoding
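
To make the effect of a maximum code length concrete, here is a minimal sketch of a buffer that stops accepting characters once the limit is reached. It is an assumption about how such a cap could be applied, not the class's internal code; CappedCodeSketch and its methods are hypothetical names.

```java
final class CappedCodeSketch {
    private final int maxLength;
    private final StringBuilder code = new StringBuilder();

    CappedCodeSketch(int maxLength) {
        this.maxLength = maxLength;
    }

    // Appends as much of the fragment as still fits under the cap.
    void append(String fragment) {
        int room = Math.max(0, maxLength - code.length());
        if (fragment.length() <= room) {
            code.append(fragment);
        } else {
            code.append(fragment, 0, room);
        }
    }

    // True once the cap is reached, so a caller could stop encoding early.
    boolean isComplete() {
        return code.length() >= maxLength;
    }

    String value() {
        return code.toString();
    }

    public static void main(String[] args) {
        CappedCodeSketch capped = new CappedCodeSketch(4);
        capped.append("SM");
        capped.append("0");
        capped.append("XYZ"); // only "X" fits
        System.out.println(capped.value()); // prints SM0X
    }
}
```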
      • conditionC0

        private boolean conditionC0​(java.lang.String value,
                                    int index)
        Complex condition 0 for 'C'.
      • conditionCH0

        private boolean conditionCH0​(java.lang.String value,
                                     int index)
        Complex condition 0 for 'CH'.
      • conditionCH1

        private boolean conditionCH1​(java.lang.String value,
                                     int index)
        Complex condition 1 for 'CH'.
      • conditionL0

        private boolean conditionL0​(java.lang.String value,
                                    int index)
        Complex condition 0 for 'L'.
      • conditionM0

        private boolean conditionM0​(java.lang.String value,
                                    int index)
        Complex condition 0 for 'M'.
      • isSlavoGermanic

        private boolean isSlavoGermanic​(java.lang.String value)
        Determines whether a value is of Slavo-Germanic origin. A value is of Slavo-Germanic origin if it contains any of 'W', 'K', 'CZ', or 'WITZ'.
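
The rule stated above can be sketched directly. The example assumes the input is already upper-cased; SlavoGermanicSketch is a hypothetical name, and the method is a standalone illustration of the documented rule rather than the private method itself.

```java
final class SlavoGermanicSketch {
    // True if the value contains 'W', 'K', "CZ" or "WITZ" (input assumed upper-case).
    static boolean isSlavoGermanic(String value) {
        return value.indexOf('W') > -1
                || value.indexOf('K') > -1
                || value.contains("CZ")
                || value.contains("WITZ");
    }

    public static void main(String[] args) {
        System.out.println(isSlavoGermanic("KOWALSKI")); // true: contains 'K' and 'W'
        System.out.println(isSlavoGermanic("SMITH"));    // false: none of the markers
    }
}
```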
      • isVowel

        private boolean isVowel​(char ch)
        Determines whether a character is a vowel.
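
A vowel test against a constant string can be sketched as below. The value "AEIOUY" is an assumption, since this page does not list the contents of VOWELS, and VowelSketch is a hypothetical name.

```java
final class VowelSketch {
    // Assumed contents; the documented VOWELS constant is not shown on this page.
    private static final String VOWELS = "AEIOUY";

    // A character is treated as a vowel if it occurs in the constant.
    static boolean isVowel(char ch) {
        return VOWELS.indexOf(ch) != -1;
    }

    public static void main(String[] args) {
        System.out.println(isVowel('A')); // true
        System.out.println(isVowel('B')); // false
    }
}
```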
      • isSilentStart

        private boolean isSilentStart​(java.lang.String value)
        Determines whether or not the value starts with a silent letter. It will return true if the value starts with any of 'GN', 'KN', 'PN', 'WR' or 'PS'.
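
The prefix check described above can be sketched as a loop over the listed prefixes. The example assumes upper-case input; SilentStartSketch is a hypothetical name and the array simply repeats the five prefixes named in the description.

```java
final class SilentStartSketch {
    // The five prefixes named in the description.
    private static final String[] SILENT_START = {"GN", "KN", "PN", "WR", "PS"};

    // True if the value begins with any of the silent prefixes.
    static boolean isSilentStart(String value) {
        for (String prefix : SILENT_START) {
            if (value.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isSilentStart("KNIGHT")); // true: starts with KN
        System.out.println(isSilentStart("NIGHT"));  // false
    }
}
```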
      • cleanInput

        private java.lang.String cleanInput​(java.lang.String input)
        Cleans the input.
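
The description does not say what cleaning involves. The sketch below shows one plausible normalization, and every step in it is an assumption: null and blank inputs yield null, surrounding whitespace is trimmed, and the result is upper-cased with a fixed locale. CleanInputSketch is a hypothetical name.

```java
import java.util.Locale;

final class CleanInputSketch {
    // Assumed behaviour: null or blank input yields null; otherwise trim and upper-case.
    static String cleanInput(String input) {
        if (input == null) {
            return null;
        }
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return null;
        }
        // A fixed locale keeps the result independent of the platform default.
        return trimmed.toUpperCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        System.out.println(cleanInput("  smith ")); // prints SMITH
        System.out.println(cleanInput("   "));      // prints null
    }
}
```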
      • charAt

        protected char charAt​(java.lang.String value,
                              int index)
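
No description is given for charAt. A bounds-tolerant lookup is one reasonable reading, sketched here under the assumption that an out-of-range index yields Character.MIN_VALUE instead of throwing; SafeCharAtSketch is a hypothetical name.

```java
final class SafeCharAtSketch {
    // Assumed behaviour: out-of-range indexes return Character.MIN_VALUE ('\u0000').
    static char charAt(String value, int index) {
        if (index < 0 || index >= value.length()) {
            return Character.MIN_VALUE;
        }
        return value.charAt(index);
    }

    public static void main(String[] args) {
        System.out.println(charAt("SMITH", 0));                         // prints S
        System.out.println(charAt("SMITH", 9) == Character.MIN_VALUE);  // prints true
    }
}
```

A lookup of this kind lets rule code peek at neighbouring positions without checking the string length at every call site.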
      • contains

        protected static boolean contains​(java.lang.String value,
                                          int start,
                                          int length,
                                          java.lang.String... criteria)
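
The contains signature suggests a test of whether the substring of the given length at start equals any of the criteria. The sketch below implements that reading as an assumption, including the choice to return false when the range falls outside the value; ContainsSketch is a hypothetical name.

```java
final class ContainsSketch {
    // Assumed behaviour: compare value[start, start + length) against each criterion.
    static boolean contains(String value, int start, int length, String... criteria) {
        if (start < 0 || length < 0 || start + length > value.length()) {
            return false; // range does not fit inside the value
        }
        String target = value.substring(start, start + length);
        for (String criterion : criteria) {
            if (target.equals(criterion)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(contains("SCHMIDT", 0, 3, "SCH", "SH")); // true: "SCH" matches
        System.out.println(contains("SCHMIDT", 5, 4, "IDT"));       // false: range runs past the end
    }
}
```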