Class MatchRatingApproachEncoder

    • Field Summary

      Fields 
      Modifier and Type Field Description
      private static java.lang.String[] DOUBLE_CONSONANT  
      private static int ELEVEN
      Constants used mainly for the min rating value.
      private static java.lang.String EMPTY  
      private static int FIVE
      Constants used mainly for the min rating value.
      private static int FOUR
      Constants used mainly for the min rating value.
      private static int ONE
      Constants used mainly for the min rating value.
      private static java.lang.String PLAIN_ASCII
      The plain letter equivalent of the accented letters.
      private static int SEVEN
      Constants used mainly for the min rating value.
      private static int SIX
      Constants used mainly for the min rating value.
      private static java.lang.String SPACE  
      private static int THREE
      Constants used mainly for the min rating value.
      private static int TWELVE
      Constants used mainly for the min rating value.
      private static int TWO
      Constants used mainly for the min rating value.
      private static java.lang.String UNICODE
      Unicode characters corresponding to various accented letters.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      (package private) java.lang.String cleanName​(java.lang.String name)
      Cleans up a name: upper-cases it, then removes common punctuation, accents and spaces.
      java.lang.Object encode​(java.lang.Object pObject)
      Encodes an Object using the Match Rating Approach algorithm.
      java.lang.String encode​(java.lang.String name)
      Encodes a String using the Match Rating Approach (MRA) algorithm.
      (package private) java.lang.String getFirst3Last3​(java.lang.String name)
      Gets the first and last 3 letters of a name if it is longer than 6 characters; otherwise returns the name unchanged.
      (package private) int getMinRating​(int sumLength)
      Obtains the min rating of the length sum of the 2 names.
      boolean isEncodeEquals​(java.lang.String name1, java.lang.String name2)
      Determines if two names are homophonous via Match Rating Approach (MRA) algorithm.
      (package private) boolean isVowel​(java.lang.String letter)
      Determines if a letter is a vowel.
      (package private) int leftToRightThenRightToLeftProcessing​(java.lang.String name1, java.lang.String name2)
      Processes the names from left to right (first) then right to left removing identical letters in same positions.
      (package private) java.lang.String removeAccents​(java.lang.String accentedWord)
      Removes accented letters and replaces them with their non-accented ASCII equivalents. Case is preserved.
      (package private) java.lang.String removeDoubleConsonants​(java.lang.String name)
      Replaces any double consonant pair with the single letter equivalent.
      (package private) java.lang.String removeVowels​(java.lang.String name)
      Deletes all vowels unless the vowel begins the word.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • ONE

        private static final int ONE
        Constants used mainly for the min rating value.
        See Also:
        Constant Field Values
      • TWO

        private static final int TWO
        Constants used mainly for the min rating value.
        See Also:
        Constant Field Values
      • THREE

        private static final int THREE
        Constants used mainly for the min rating value.
        See Also:
        Constant Field Values
      • FOUR

        private static final int FOUR
        Constants used mainly for the min rating value.
        See Also:
        Constant Field Values
      • FIVE

        private static final int FIVE
        Constants used mainly for the min rating value.
        See Also:
        Constant Field Values
      • SIX

        private static final int SIX
        Constants used mainly for the min rating value.
        See Also:
        Constant Field Values
      • SEVEN

        private static final int SEVEN
        Constants used mainly for the min rating value.
        See Also:
        Constant Field Values
      • ELEVEN

        private static final int ELEVEN
        Constants used mainly for the min rating value.
        See Also:
        Constant Field Values
      • TWELVE

        private static final int TWELVE
        Constants used mainly for the min rating value.
        See Also:
        Constant Field Values
      • PLAIN_ASCII

        private static final java.lang.String PLAIN_ASCII
        The plain letter equivalent of the accented letters.
        See Also:
        Constant Field Values
      • UNICODE

        private static final java.lang.String UNICODE
        Unicode characters corresponding to various accented letters. For example, Ú is U with an acute accent.
        See Also:
        Constant Field Values
      • DOUBLE_CONSONANT

        private static final java.lang.String[] DOUBLE_CONSONANT
    • Constructor Detail

      • MatchRatingApproachEncoder

        public MatchRatingApproachEncoder()
    • Method Detail

      • cleanName

        java.lang.String cleanName​(java.lang.String name)
        Cleans up a name:
        1. Upper-cases everything.
        2. Removes some common punctuation.
        3. Removes accents.
        4. Removes any spaces.

        API Usage

        Consider this method private; it is package-private for unit testing only.

        Parameters:
        name - The name to be cleaned
        Returns:
        The cleaned name
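
        The four cleaning steps listed above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name CleanNameSketch is hypothetical, the exact punctuation set is an assumption, and accents are stripped with java.text.Normalizer rather than with the UNICODE and PLAIN_ASCII lookup constants described on this page.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical stand-alone sketch; not part of the documented class.
class CleanNameSketch {

    static String cleanName(String name) {
        // 1. Upper-case everything.
        String s = name.toUpperCase(Locale.ENGLISH);
        // 2. Remove some common punctuation (this particular set is an assumption).
        s = s.replaceAll("[-&'.,]", "");
        // 3. Remove accents: decompose each letter, then drop the combining marks.
        s = Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
        // 4. Remove any whitespace.
        return s.replaceAll("\\s+", "");
    }
}
```

        With this sketch, cleanName("O'Brien-Smith") yields "OBRIENSMITH".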
      • encode

        public final java.lang.Object encode​(java.lang.Object pObject)
                                      throws EncoderException
        Encodes an Object using the Match Rating Approach algorithm. This method exists to satisfy the requirements of the Encoder interface. Throws an EncoderException if the input object is not of type java.lang.String.
        Specified by:
        encode in interface Encoder
        Parameters:
        pObject - Object to encode
        Returns:
        An object (of type java.lang.String) containing the Match Rating Approach code which corresponds to the String supplied.
        Throws:
        EncoderException - if the parameter supplied is not of type java.lang.String
      • encode

        public final java.lang.String encode​(java.lang.String name)
        Encodes a String using the Match Rating Approach (MRA) algorithm.
        Specified by:
        encode in interface StringEncoder
        Parameters:
        name - String object to encode
        Returns:
        The MRA code corresponding to the String supplied
      • getFirst3Last3

        java.lang.String getFirst3Last3​(java.lang.String name)
        Gets the first and last 3 letters of a name if it is longer than 6 characters; otherwise returns the name unchanged.

        API Usage

        Consider this method private; it is package-private for unit testing only.

        Parameters:
        name - The string to get the substrings from
        Returns:
        Annexed first and last 3 letters of input word.
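
        A minimal sketch of the truncation rule described above, assuming the input has already been cleaned. The class and method names are hypothetical and chosen for illustration only.

```java
// Hypothetical stand-alone sketch; not part of the documented class.
class First3Last3Sketch {

    static String firstThreeLastThree(String name) {
        int length = name.length();
        if (length > 6) {
            // Join the first three and the last three characters.
            return name.substring(0, 3) + name.substring(length - 3);
        }
        // Names of six characters or fewer are returned unchanged.
        return name;
    }
}
```

        For example, firstThreeLastThree("CHRSTPHR") yields "CHRPHR", while "SMTH" is returned as-is.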
      • getMinRating

        int getMinRating​(int sumLength)
        Obtains the min rating of the length sum of the 2 names. In essence, the larger the sum length, the smaller the min rating. Values strictly from documentation.

        API Usage

        Consider this method private; it is package-private for unit testing only.

        Parameters:
        sumLength - The combined length of the 2 strings being compared
        Returns:
        The min rating value
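
        The inverse relationship between sum length and min rating can be sketched as a simple threshold table. The thresholds below follow the table commonly published for the Match Rating Approach and line up with the constant names ONE through SEVEN, ELEVEN and TWELVE; treat them as an assumption, since this page does not state them. The class name is hypothetical.

```java
// Hypothetical stand-alone sketch; thresholds are an assumption, see lead-in.
class MinRatingSketch {

    static int minRating(int sumLength) {
        if (sumLength <= 4) {
            return 5;
        }
        if (sumLength <= 7) {   // 5 to 7
            return 4;
        }
        if (sumLength <= 11) {  // 8 to 11
            return 3;
        }
        if (sumLength == 12) {
            return 2;
        }
        return 1;               // anything longer
    }
}
```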
      • isEncodeEquals

        public boolean isEncodeEquals​(java.lang.String name1,
                                      java.lang.String name2)
        Determines if two names are homophonous via the Match Rating Approach (MRA) algorithm. Note that the strings are cleaned in the same way as in encode(String).
        Parameters:
        name1 - First of the 2 strings (names) to compare
        name2 - Second of the 2 names to compare
        Returns:
        true if the encodings are identical, false otherwise.
      • isVowel

        boolean isVowel​(java.lang.String letter)
        Determines if a letter is a vowel.

        API Usage

        Consider this method private; it is package-private for unit testing only.

        Parameters:
        letter - The letter under investigation
        Returns:
        True if a vowel, else false
      • leftToRightThenRightToLeftProcessing

        int leftToRightThenRightToLeftProcessing​(java.lang.String name1,
                                                 java.lang.String name2)
        Processes the names from left to right first, then from right to left, removing identical letters in the same positions. Then subtracts the length of the longer remaining string from 6 and returns the result.

        API Usage

        Consider this method private; it is package-private for unit testing only.

        Parameters:
        name1 - First of the 2 names to compare
        name2 - Second of the 2 names to compare
        Returns:
        6 minus the length of the longer string that remains
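
        The comparison described for this method can be sketched as two passes. This is an illustrative reading of the description, not the library's code: it assumes both inputs are already encoded (at most 6 characters each) and performs the right-to-left pass on whatever the left-to-right pass leaves behind. All names below are hypothetical.

```java
// Hypothetical stand-alone sketch; a two-pass reading of the description above.
class SimilaritySketch {

    // Drops characters that are identical at the same offset from the left
    // and returns what remains of each string.
    static String[] stripAlignedMatches(String a, String b) {
        StringBuilder restA = new StringBuilder();
        StringBuilder restB = new StringBuilder();
        int max = Math.max(a.length(), b.length());
        for (int i = 0; i < max; i++) {
            boolean inA = i < a.length();
            boolean inB = i < b.length();
            if (inA && inB && a.charAt(i) == b.charAt(i)) {
                continue; // identical letter in the same position: remove from both
            }
            if (inA) {
                restA.append(a.charAt(i));
            }
            if (inB) {
                restB.append(b.charAt(i));
            }
        }
        return new String[] { restA.toString(), restB.toString() };
    }

    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    static int similarity(String name1, String name2) {
        // Pass 1: left to right.
        String[] leftPass = stripAlignedMatches(name1, name2);
        // Pass 2: right to left, done by reversing the remainders.
        String[] rightPass = stripAlignedMatches(reverse(leftPass[0]), reverse(leftPass[1]));
        int longer = Math.max(rightPass[0].length(), rightPass[1].length());
        return 6 - longer;
    }
}
```

        For the encoded pair "BYRN" and "BRN", the left-to-right pass removes the leading B, the right-to-left pass removes N and R, and a single Y remains, so similarity returns 5.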
      • removeAccents

        java.lang.String removeAccents​(java.lang.String accentedWord)
        Removes accented letters and replaces them with their non-accented ASCII equivalents. Case is preserved. See: http://www.codecodex.com/wiki/Remove_accent_from_letters_%28ex_.%C3%A9_to_e%29
        Parameters:
        accentedWord - The word that may have accents in it.
        Returns:
        De-accented word
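
        One way to read the relationship between the UNICODE and PLAIN_ASCII constants is as two parallel strings, where the character at index i of one maps to the character at index i of the other. The sketch below illustrates that lookup with a deliberately tiny pair of tables; the class name, the table contents and the loop are assumptions for illustration, not the library's source.

```java
// Hypothetical stand-alone sketch of a parallel-string accent lookup.
class AccentSketch {

    // Small illustrative subset: A-grave, a-grave, E-acute, e-acute, U-acute, u-acute.
    static final String ACCENTED = "\u00C0\u00E0\u00C9\u00E9\u00DA\u00FA";
    // Plain equivalents at the same indexes, so case is preserved.
    static final String PLAIN = "AaEeUu";

    static String removeAccents(String word) {
        StringBuilder out = new StringBuilder(word.length());
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            int pos = ACCENTED.indexOf(c);
            // Replace known accented letters; copy everything else unchanged.
            out.append(pos >= 0 ? PLAIN.charAt(pos) : c);
        }
        return out.toString();
    }
}
```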
      • removeDoubleConsonants

        java.lang.String removeDoubleConsonants​(java.lang.String name)
        Replaces any double consonant pair with the single letter equivalent.

        API Usage

        Consider this method private; it is package-private for unit testing only.

        Parameters:
        name - String to have double consonants removed
        Returns:
        Single consonant word
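
        A minimal sketch of collapsing double consonants, assuming the name is compared in upper case and that every letter other than A, E, I, O and U counts as a consonant. The class name is hypothetical, and the loop stands in for the DOUBLE_CONSONANT array, whose contents are not shown on this page.

```java
import java.util.Locale;

// Hypothetical stand-alone sketch; not part of the documented class.
class DoubleConsonantSketch {

    static String removeDoubleConsonants(String name) {
        String result = name.toUpperCase(Locale.ENGLISH);
        for (char c = 'A'; c <= 'Z'; c++) {
            if ("AEIOU".indexOf(c) >= 0) {
                continue; // vowels are left alone
            }
            String single = String.valueOf(c);
            // Replace each doubled consonant pair with a single letter.
            result = result.replace(single + single, single);
        }
        return result;
    }
}
```

        For example, removeDoubleConsonants("MISSISSIPPI") yields "MISISIPI", while the doubled vowel in "COOK" is kept.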
      • removeVowels

        java.lang.String removeVowels​(java.lang.String name)
        Deletes all vowels unless the vowel begins the word.

        API Usage

        Consider this method private; it is package-private for unit testing only.

        Parameters:
        name - The name to have vowels removed
        Returns:
        De-voweled word
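
        The vowel rule described above can be sketched as follows, assuming the vowels are A, E, I, O and U and that the check is case-insensitive. The class name and the char-based isVowel helper are hypothetical; the documented isVowel takes a String.

```java
// Hypothetical stand-alone sketch; not part of the documented class.
class VowelSketch {

    static boolean isVowel(char c) {
        return "AEIOU".indexOf(Character.toUpperCase(c)) >= 0;
    }

    static String removeVowels(String name) {
        StringBuilder out = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            // Keep the first character even if it is a vowel; drop later vowels.
            if (i == 0 || !isVowel(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

        For example, removeVowels("ALBERT") yields "ALBRT" and removeVowels("BYRNE") yields "BYRN".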