Class BaseNCodec

  • All Implemented Interfaces:
    BinaryDecoder, BinaryEncoder, Decoder, Encoder
    Direct Known Subclasses:
    Base16, Base32, Base64

    public abstract class BaseNCodec
    extends java.lang.Object
    implements BinaryEncoder, BinaryDecoder
    Abstract superclass for Base-N encoders and decoders.

    This class is thread-safe.

    You can set the decoding behavior when the input bytes contain leftover trailing bits that cannot be created by a valid encoding. These can be bits that are unused from the final character or entire characters. The default mode is lenient decoding.
    • Lenient: Any trailing bits are composed into 8-bit bytes where possible. The remainder are discarded.
    • Strict: The decoding will raise an IllegalArgumentException if trailing bits are not part of a valid encoding. Any unused bits from the final character must be zero. Impossible counts of entire final characters are not allowed.

    When strict decoding is enabled it is expected that the decoded bytes will be re-encoded to a byte array that matches the original, i.e. no changes occur on the final character. This requires that the input bytes use the same padding and alphabet as the encoder.
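The round-trip expectation described above can be illustrated with a minimal sketch. It uses the JDK's java.util.Base64 as a stand-in rather than a BaseNCodec subclass, and the class and method names (StrictRoundTripSketch, roundTrips) are hypothetical. The idea is only to show what "re-encoded to a byte array that matches the original" means for a final character carrying non-zero unused bits.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class StrictRoundTripSketch {

    /** Returns true if decoding and then re-encoding reproduces the input bytes exactly. */
    static boolean roundTrips(String encoded) {
        byte[] original = encoded.getBytes(StandardCharsets.US_ASCII);
        try {
            byte[] decoded = Base64.getDecoder().decode(original);
            byte[] reencoded = Base64.getEncoder().encode(decoded);
            return Arrays.equals(original, reencoded);
        } catch (IllegalArgumentException e) {
            // A decoder that rejects the input outright also fails the round trip.
            return false;
        }
    }

    public static void main(String[] args) {
        // "QQ==" encodes the single byte 0x41; the 4 unused bits of the second 'Q' are zero.
        System.out.println(roundTrips("QQ=="));
        // "QR==" carries the same byte, but the unused bits of 'R' are 0001.
        System.out.println(roundTrips("QR=="));
    }
}
```

Input such as "QR==" is the kind that a strict policy is documented to reject and a lenient policy is documented to accept by discarding the leftover bits.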

    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      (package private) static class  BaseNCodec.Context
      Holds thread context so classes can be thread-safe.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      (package private) static byte[] CHUNK_SEPARATOR
      Chunk separator per RFC 2045 section 2.1.
      private int chunkSeparatorLength
      Size of chunk separator.
      protected static CodecPolicy DECODING_POLICY_DEFAULT
      The default decoding policy.
      private CodecPolicy decodingPolicy
      Defines the decoding behavior when the input bytes contain leftover trailing bits that cannot be created by a valid encoding.
      private static int DEFAULT_BUFFER_RESIZE_FACTOR  
      private static int DEFAULT_BUFFER_SIZE
      Defines the default buffer size - currently 8192 - must be large enough for at least one encoded block+separator
      private int encodedBlockSize
      Number of bytes in each full block of encoded data, e.g. 4 for Base64 and 8 for Base32.
      (package private) static int EOF
      EOF
      protected int lineLength
      Chunksize for encoding.
      protected static int MASK_8BITS
      Mask used to extract 8 bits, used in decoding bytes
      private static int MAX_BUFFER_SIZE
      The maximum size buffer to allocate.
      static int MIME_CHUNK_SIZE
      MIME chunk size per RFC 2045 section 6.8.
      protected byte pad  
      protected byte PAD
      Deprecated.
      Use pad.
      protected static byte PAD_DEFAULT
      Byte used to pad output.
      static int PEM_CHUNK_SIZE
      PEM chunk size per RFC 1421 section 4.3.2.4.
      private int unencodedBlockSize
      Number of bytes in each full block of unencoded data, e.g. 3 for Base64 and 5 for Base32.
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      protected BaseNCodec​(int unencodedBlockSize, int encodedBlockSize, int lineLength, int chunkSeparatorLength)
      Note lineLength is rounded down to the nearest multiple of the encoded block size.
      protected BaseNCodec​(int unencodedBlockSize, int encodedBlockSize, int lineLength, int chunkSeparatorLength, byte pad)
      Note lineLength is rounded down to the nearest multiple of the encoded block size.
      protected BaseNCodec​(int unencodedBlockSize, int encodedBlockSize, int lineLength, int chunkSeparatorLength, byte pad, CodecPolicy decodingPolicy)
      Note lineLength is rounded down to the nearest multiple of the encoded block size.
    • Method Summary

      All Methods Static Methods Instance Methods Abstract Methods Concrete Methods 
      Modifier and Type Method Description
      (package private) int available​(BaseNCodec.Context context)
      Returns the amount of buffered data available for reading.
      private static int compareUnsigned​(int x, int y)
      Compares two int values numerically treating the values as unsigned.
      protected boolean containsAlphabetOrPad​(byte[] arrayOctet)
      Tests a given byte array to see if it contains any characters within the alphabet or PAD.
      private static int createPositiveCapacity​(int minCapacity)
      Create a positive capacity at least as large as the minimum required capacity.
      byte[] decode​(byte[] pArray)
      Decodes a byte[] containing characters in the Base-N alphabet.
      (package private) abstract void decode​(byte[] pArray, int i, int length, BaseNCodec.Context context)  
      java.lang.Object decode​(java.lang.Object obj)
      Decodes an Object using the Base-N algorithm.
      byte[] decode​(java.lang.String pArray)
      Decodes a String containing characters in the Base-N alphabet.
      byte[] encode​(byte[] pArray)
      Encodes a byte[] containing binary data, into a byte[] containing characters in the alphabet.
      byte[] encode​(byte[] pArray, int offset, int length)
      Encodes a byte[] containing binary data, into a byte[] containing characters in the alphabet.
      (package private) abstract void encode​(byte[] pArray, int i, int length, BaseNCodec.Context context)  
      java.lang.Object encode​(java.lang.Object obj)
      Encodes an Object using the Base-N algorithm.
      java.lang.String encodeAsString​(byte[] pArray)
      Encodes a byte[] containing binary data, into a String containing characters in the appropriate alphabet.
      java.lang.String encodeToString​(byte[] pArray)
      Encodes a byte[] containing binary data, into a String containing characters in the Base-N alphabet.
      protected byte[] ensureBufferSize​(int size, BaseNCodec.Context context)
      Ensure that the buffer has room for size bytes
      static byte[] getChunkSeparator()
      Gets a copy of the chunk separator per RFC 2045 section 2.1.
      CodecPolicy getCodecPolicy()
      Returns the decoding behavior policy.
      protected int getDefaultBufferSize()
      Get the default buffer size.
      long getEncodedLength​(byte[] pArray)
      Calculates the amount of space needed to encode the supplied array.
      (package private) boolean hasData​(BaseNCodec.Context context)
      Returns true if this object has buffered data for reading.
      protected abstract boolean isInAlphabet​(byte value)
      Returns whether or not the octet is in the current alphabet.
      boolean isInAlphabet​(byte[] arrayOctet, boolean allowWSPad)
      Tests a given byte array to see if it contains only valid characters within the alphabet.
      boolean isInAlphabet​(java.lang.String basen)
      Tests a given String to see if it contains only valid characters within the alphabet.
      boolean isStrictDecoding()
      Returns true if decoding behavior is strict.
      protected static boolean isWhiteSpace​(byte byteToCheck)
      Checks if a byte value is whitespace or not.
      (package private) int readResults​(byte[] b, int bPos, int bAvail, BaseNCodec.Context context)
      Extracts buffered data into the provided byte[] array, starting at position bPos, up to a maximum of bAvail bytes.
      private static byte[] resizeBuffer​(BaseNCodec.Context context, int minCapacity)
      Increases our buffer by the DEFAULT_BUFFER_RESIZE_FACTOR.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • MIME_CHUNK_SIZE

        public static final int MIME_CHUNK_SIZE
        MIME chunk size per RFC 2045 section 6.8.

        The 76 character limit does not count the trailing CRLF, but counts all other characters, including any equal signs.
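As a rough illustration of the 76-character rule, the sketch below uses the JDK's own MIME encoder (java.util.Base64.getMimeEncoder()), not this class, to show lines that exclude the CRLF from the count but include padding. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class MimeChunkSketch {

    /** Encodes data with the JDK MIME encoder and splits the result on CRLF. */
    static String[] encodeToLines(byte[] data) {
        byte[] encoded = Base64.getMimeEncoder().encode(data);
        return new String(encoded, StandardCharsets.US_ASCII).split("\r\n");
    }

    public static void main(String[] args) {
        // 100 input bytes become 136 Base64 characters: one full line and one partial line.
        for (String line : encodeToLines(new byte[100])) {
            System.out.println(line.length() + " chars: " + line);
        }
    }
}
```

The final line ends in "==", and those equal signs count toward its length just like any other character.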

        See Also:
        RFC 2045 section 6.8, Constant Field Values
      • PEM_CHUNK_SIZE

        public static final int PEM_CHUNK_SIZE
        PEM chunk size per RFC 1421 section 4.3.2.4.

        The 64 character limit does not count the trailing CRLF, but counts all other characters, including any equal signs.

        See Also:
        RFC 1421 section 4.3.2.4, Constant Field Values
      • DEFAULT_BUFFER_RESIZE_FACTOR

        private static final int DEFAULT_BUFFER_RESIZE_FACTOR
        See Also:
        Constant Field Values
      • DEFAULT_BUFFER_SIZE

        private static final int DEFAULT_BUFFER_SIZE
        Defines the default buffer size - currently 8192 - must be large enough for at least one encoded block+separator
        See Also:
        Constant Field Values
      • MAX_BUFFER_SIZE

        private static final int MAX_BUFFER_SIZE
        The maximum size buffer to allocate.

        This is set to the same size used in the JDK java.util.ArrayList:

        Some VMs reserve some header words in an array. Attempts to allocate larger arrays may result in OutOfMemoryError: Requested array size exceeds VM limit.
        See Also:
        Constant Field Values
      • MASK_8BITS

        protected static final int MASK_8BITS
        Mask used to extract 8 bits, used in decoding bytes
        See Also:
        Constant Field Values
      • PAD_DEFAULT

        protected static final byte PAD_DEFAULT
        Byte used to pad output.
        See Also:
        Constant Field Values
      • DECODING_POLICY_DEFAULT

        protected static final CodecPolicy DECODING_POLICY_DEFAULT
        The default decoding policy.
        Since:
        1.15
      • CHUNK_SEPARATOR

        static final byte[] CHUNK_SEPARATOR
        Chunk separator per RFC 2045 section 2.1.
        See Also:
        RFC 2045 section 2.1
      • PAD

        @Deprecated
        protected final byte PAD
        Deprecated.
        Use pad. Will be removed in 2.0.
        See Also:
        Constant Field Values
      • pad

        protected final byte pad
      • unencodedBlockSize

        private final int unencodedBlockSize
        Number of bytes in each full block of unencoded data, e.g. 3 for Base64 and 5 for Base32
      • encodedBlockSize

        private final int encodedBlockSize
        Number of bytes in each full block of encoded data, e.g. 4 for Base64 and 8 for Base32
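The two block sizes follow from how many bits each encoded character carries. A minimal sketch of that arithmetic, assuming a full block is the least common multiple of 8 bits and the bits per character (the helper names are hypothetical):

```java
final class BlockSizeSketch {

    static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    /**
     * Returns {unencodedBlockSize, encodedBlockSize} for an alphabet in which
     * each character carries bitsPerChar bits.
     */
    static int[] blockSizes(int bitsPerChar) {
        // Smallest bit count that is a whole number of bytes and of characters.
        int bitsPerBlock = 8 / gcd(8, bitsPerChar) * bitsPerChar;
        return new int[] { bitsPerBlock / 8, bitsPerBlock / bitsPerChar };
    }

    public static void main(String[] args) {
        for (int bits : new int[] { 4, 5, 6 }) {
            int[] sizes = blockSizes(bits);
            System.out.println(bits + " bits/char: " + sizes[0] + " bytes in, " + sizes[1] + " characters out");
        }
    }
}
```

For 6 bits per character (Base64) this gives 3 unencoded bytes per 4 encoded characters; for 5 bits (Base32) it gives 5 and 8.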
      • lineLength

        protected final int lineLength
        Chunksize for encoding. Not used when decoding. A value of zero or less implies no chunking of the encoded data. Rounded down to nearest multiple of encodedBlockSize.
      • chunkSeparatorLength

        private final int chunkSeparatorLength
        Size of chunk separator. Not used unless lineLength > 0.
      • decodingPolicy

        private final CodecPolicy decodingPolicy
        Defines the decoding behavior when the input bytes contain leftover trailing bits that cannot be created by a valid encoding. These can be bits that are unused from the final character or entire characters. The default mode is lenient decoding. Set this to CodecPolicy.STRICT to enable strict decoding.
        • Lenient: Any trailing bits are composed into 8-bit bytes where possible. The remainder are discarded.
        • Strict: The decoding will raise an IllegalArgumentException if trailing bits are not part of a valid encoding. Any unused bits from the final character must be zero. Impossible counts of entire final characters are not allowed.

        When strict decoding is enabled it is expected that the decoded bytes will be re-encoded to a byte array that matches the original, i.e. no changes occur on the final character. This requires that the input bytes use the same padding and alphabet as the encoder.

    • Constructor Detail

      • BaseNCodec

        protected BaseNCodec​(int unencodedBlockSize,
                             int encodedBlockSize,
                             int lineLength,
                             int chunkSeparatorLength)
        Note lineLength is rounded down to the nearest multiple of the encoded block size. If chunkSeparatorLength is zero, then chunking is disabled.
        Parameters:
        unencodedBlockSize - the size of an unencoded block (e.g. Base64 = 3)
        encodedBlockSize - the size of an encoded block (e.g. Base64 = 4)
        lineLength - if > 0, use chunking with a length lineLength
        chunkSeparatorLength - the chunk separator length, if relevant
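The rounding and chunking rules in the constructor note can be sketched as follows. This is an assumption about how the effective line length is derived from the documented behavior, not a copy of the constructor body; the class and method names are hypothetical.

```java
final class LineLengthSketch {

    /**
     * Returns the effective line length: rounded down to a multiple of
     * encodedBlockSize, or 0 (no chunking) when either the requested line
     * length or the separator length is not positive.
     */
    static int effectiveLineLength(int lineLength, int encodedBlockSize, int chunkSeparatorLength) {
        boolean useChunking = lineLength > 0 && chunkSeparatorLength > 0;
        return useChunking ? (lineLength / encodedBlockSize) * encodedBlockSize : 0;
    }

    public static void main(String[] args) {
        System.out.println(effectiveLineLength(76, 4, 2)); // Base64, MIME line length
        System.out.println(effectiveLineLength(76, 8, 2)); // Base32, rounded down
        System.out.println(effectiveLineLength(76, 4, 0)); // no separator, no chunking
    }
}
```

Note that under this sketch a requested line length smaller than one encoded block rounds down to zero, which also disables chunking.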
      • BaseNCodec

        protected BaseNCodec​(int unencodedBlockSize,
                             int encodedBlockSize,
                             int lineLength,
                             int chunkSeparatorLength,
                             byte pad)
        Note lineLength is rounded down to the nearest multiple of the encoded block size. If chunkSeparatorLength is zero, then chunking is disabled.
        Parameters:
        unencodedBlockSize - the size of an unencoded block (e.g. Base64 = 3)
        encodedBlockSize - the size of an encoded block (e.g. Base64 = 4)
        lineLength - if > 0, use chunking with a length lineLength
        chunkSeparatorLength - the chunk separator length, if relevant
        pad - byte used as padding byte.
      • BaseNCodec

        protected BaseNCodec​(int unencodedBlockSize,
                             int encodedBlockSize,
                             int lineLength,
                             int chunkSeparatorLength,
                             byte pad,
                             CodecPolicy decodingPolicy)
        Note lineLength is rounded down to the nearest multiple of the encoded block size. If chunkSeparatorLength is zero, then chunking is disabled.
        Parameters:
        unencodedBlockSize - the size of an unencoded block (e.g. Base64 = 3)
        encodedBlockSize - the size of an encoded block (e.g. Base64 = 4)
        lineLength - if > 0, use chunking with a length lineLength
        chunkSeparatorLength - the chunk separator length, if relevant
        pad - byte used as padding byte.
        decodingPolicy - Decoding policy.
        Since:
        1.15
    • Method Detail

      • compareUnsigned

        private static int compareUnsigned​(int x,
                                           int y)
        Compares two int values numerically treating the values as unsigned. Taken from JDK 1.8.

        TODO: Replace with JDK 1.8 Integer::compareUnsigned(int, int).

        Parameters:
        x - the first int to compare
        y - the second int to compare
        Returns:
        the value 0 if x == y; a value less than 0 if x < y as unsigned values; and a value greater than 0 if x > y as unsigned values
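A minimal sketch of an unsigned comparison of this kind: adding Integer.MIN_VALUE to both operands flips the sign bit, so a signed comparison of the shifted values orders them as unsigned. The class name is hypothetical; on Java 8 and later Integer.compareUnsigned(int, int) gives the same ordering.

```java
final class CompareUnsignedSketch {

    /** Compares x and y as unsigned 32-bit values. */
    static int compareUnsigned(int x, int y) {
        // Flipping the sign bit maps unsigned order onto signed order.
        return Integer.compare(x + Integer.MIN_VALUE, y + Integer.MIN_VALUE);
    }

    public static void main(String[] args) {
        // -1 is 0xFFFFFFFF, the largest unsigned value, so it compares greater than 1.
        System.out.println(compareUnsigned(-1, 1) > 0);
        System.out.println(compareUnsigned(1, -1) < 0);
        System.out.println(compareUnsigned(5, 5) == 0);
    }
}
```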
      • createPositiveCapacity

        private static int createPositiveCapacity​(int minCapacity)
        Create a positive capacity at least as large as the minimum required capacity. If the minimum capacity is negative then this throws an OutOfMemoryError as no array can be allocated.
        Parameters:
        minCapacity - the minimum capacity
        Returns:
        the capacity
        Throws:
        java.lang.OutOfMemoryError - if the minCapacity is negative
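The documented contract can be sketched as below. The value Integer.MAX_VALUE - 8 for MAX_BUFFER_SIZE is an assumption based on the reference to java.util.ArrayList; the actual constant is listed under Constant Field Values and is not shown on this page. The class name and the exception message are hypothetical.

```java
final class CapacitySketch {

    // Assumed value; mirrors the soft array-size limit used by JDK collections.
    static final int MAX_BUFFER_SIZE = Integer.MAX_VALUE - 8;

    /**
     * Returns a capacity that is at least minCapacity. A negative minCapacity
     * means the requested size overflowed int, so no array can satisfy it.
     */
    static int createPositiveCapacity(int minCapacity) {
        if (minCapacity < 0) {
            throw new OutOfMemoryError("Unable to allocate array size: " + (minCapacity & 0xffffffffL));
        }
        // Stay at the soft limit unless the caller really needs more.
        return Math.max(minCapacity, MAX_BUFFER_SIZE);
    }

    public static void main(String[] args) {
        System.out.println(createPositiveCapacity(10) == MAX_BUFFER_SIZE);
        System.out.println(createPositiveCapacity(Integer.MAX_VALUE) == Integer.MAX_VALUE);
    }
}
```

Masking with 0xffffffffL in the message reports the overflowed value as the unsigned size that was actually requested.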
      • getChunkSeparator

        public static byte[] getChunkSeparator()
        Gets a copy of the chunk separator per RFC 2045 section 2.1.
        Returns:
        the chunk separator
        Since:
        1.15
        See Also:
        RFC 2045 section 2.1
      • isWhiteSpace

        protected static boolean isWhiteSpace​(byte byteToCheck)
        Checks if a byte value is whitespace or not. Whitespace is taken to mean: space, tab, CR, LF
        Parameters:
        byteToCheck - the byte to check
        Returns:
        true if byte is whitespace, false otherwise
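The whitespace definition above (space, tab, CR, LF) amounts to a four-way test. A minimal sketch, with a hypothetical class name:

```java
final class WhiteSpaceSketch {

    /** Returns true for space, line feed, carriage return and tab only. */
    static boolean isWhiteSpace(byte byteToCheck) {
        switch (byteToCheck) {
            case ' ':
            case '\n':
            case '\r':
            case '\t':
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWhiteSpace((byte) '\t'));
        System.out.println(isWhiteSpace((byte) 'A'));
    }
}
```

Other characters that Character.isWhitespace would accept, such as form feed or vertical tab, are not whitespace under this definition.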
      • resizeBuffer

        private static byte[] resizeBuffer​(BaseNCodec.Context context,
                                           int minCapacity)
        Increases our buffer by the DEFAULT_BUFFER_RESIZE_FACTOR.
        Parameters:
        context - the context to be used
        minCapacity - the minimum required capacity
        Returns:
        the resized byte[] buffer
        Throws:
        java.lang.OutOfMemoryError - if the minCapacity is negative
      • available

        int available​(BaseNCodec.Context context)
        Returns the amount of buffered data available for reading.
        Parameters:
        context - the context to be used
        Returns:
        The amount of buffered data available for reading.
      • containsAlphabetOrPad

        protected boolean containsAlphabetOrPad​(byte[] arrayOctet)
        Tests a given byte array to see if it contains any characters within the alphabet or PAD. Intended for use in checking line-ending arrays
        Parameters:
        arrayOctet - byte array to test
        Returns:
        true if any byte is a valid character in the alphabet or PAD; false otherwise
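A sketch of this check, written against a hypothetical alphabet (upper-case hexadecimal digits) because isInAlphabet(byte) is abstract here. The pad byte '=' and the null handling are assumptions for illustration.

```java
final class ContainsAlphabetOrPadSketch {

    static final byte PAD = '='; // assumed pad byte

    // Hypothetical stand-in for the abstract isInAlphabet(byte).
    static boolean isInAlphabet(byte value) {
        return (value >= '0' && value <= '9') || (value >= 'A' && value <= 'F');
    }

    /** Returns true as soon as one byte is an alphabet character or the pad byte. */
    static boolean containsAlphabetOrPad(byte[] arrayOctet) {
        if (arrayOctet == null) {
            return false;
        }
        for (byte element : arrayOctet) {
            if (element == PAD || isInAlphabet(element)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A CRLF separator is safe; a separator containing 'A' would be ambiguous when decoding.
        System.out.println(containsAlphabetOrPad(new byte[] { '\r', '\n' }));
        System.out.println(containsAlphabetOrPad(new byte[] { '\r', 'A' }));
    }
}
```

This is why the method suits line-ending arrays: a separator that contains an alphabet or pad byte could not be told apart from encoded data.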
      • decode

        public byte[] decode​(byte[] pArray)
        Decodes a byte[] containing characters in the Base-N alphabet.
        Specified by:
        decode in interface BinaryDecoder
        Parameters:
        pArray - A byte array containing Base-N character data
        Returns:
        a byte array containing binary data
      • decode

        abstract void decode​(byte[] pArray,
                             int i,
                             int length,
                             BaseNCodec.Context context)
      • decode

        public java.lang.Object decode​(java.lang.Object obj)
                                throws DecoderException
        Decodes an Object using the Base-N algorithm. This method is provided in order to satisfy the requirements of the Decoder interface, and will throw a DecoderException if the supplied object is not of type byte[] or String.
        Specified by:
        decode in interface Decoder
        Parameters:
        obj - Object to decode
        Returns:
        An object (of type byte[]) containing the binary data which corresponds to the byte[] or String supplied.
        Throws:
        DecoderException - if the parameter supplied is not of type byte[] or String
      • decode

        public byte[] decode​(java.lang.String pArray)
        Decodes a String containing characters in the Base-N alphabet.
        Parameters:
        pArray - A String containing Base-N character data
        Returns:
        a byte array containing binary data
      • encode

        public byte[] encode​(byte[] pArray)
        Encodes a byte[] containing binary data, into a byte[] containing characters in the alphabet.
        Specified by:
        encode in interface BinaryEncoder
        Parameters:
        pArray - a byte array containing binary data
        Returns:
        A byte array containing only the base N alphabetic character data
      • encode

        public byte[] encode​(byte[] pArray,
                             int offset,
                             int length)
        Encodes a byte[] containing binary data, into a byte[] containing characters in the alphabet.
        Parameters:
        pArray - a byte array containing binary data
        offset - initial offset of the subarray.
        length - length of the subarray.
        Returns:
        A byte array containing only the base N alphabetic character data
        Since:
        1.11
      • encode

        abstract void encode​(byte[] pArray,
                             int i,
                             int length,
                             BaseNCodec.Context context)
      • encode

        public java.lang.Object encode​(java.lang.Object obj)
                                throws EncoderException
        Encodes an Object using the Base-N algorithm. This method is provided in order to satisfy the requirements of the Encoder interface, and will throw an EncoderException if the supplied object is not of type byte[].
        Specified by:
        encode in interface Encoder
        Parameters:
        obj - Object to encode
        Returns:
        An object (of type byte[]) containing the Base-N encoded data which corresponds to the byte[] supplied.
        Throws:
        EncoderException - if the parameter supplied is not of type byte[]
      • encodeAsString

        public java.lang.String encodeAsString​(byte[] pArray)
        Encodes a byte[] containing binary data, into a String containing characters in the appropriate alphabet. Uses UTF-8 encoding.
        Parameters:
        pArray - a byte array containing binary data
        Returns:
        String containing only character data in the appropriate alphabet.
        Since:
        1.5
        Note: this method is a duplicate of encodeToString(byte[]); it was merged during refactoring.
      • encodeToString

        public java.lang.String encodeToString​(byte[] pArray)
        Encodes a byte[] containing binary data, into a String containing characters in the Base-N alphabet. Uses UTF-8 encoding.
        Parameters:
        pArray - a byte array containing binary data
        Returns:
        A String containing only Base-N character data
      • ensureBufferSize

        protected byte[] ensureBufferSize​(int size,
                                          BaseNCodec.Context context)
        Ensure that the buffer has room for size bytes
        Parameters:
        size - minimum spare space required
        context - the context to be used
        Returns:
        the buffer
      • getCodecPolicy

        public CodecPolicy getCodecPolicy()
        Returns the decoding behavior policy.

        The default is lenient. If the decoding policy is strict, then decoding will raise an IllegalArgumentException if trailing bits are not part of a valid encoding. If the policy is lenient, decoding will compose trailing bits into 8-bit bytes and discard the remainder.

        Returns:
        the decoding policy (strict or lenient)
        Since:
        1.15
      • getDefaultBufferSize

        protected int getDefaultBufferSize()
        Get the default buffer size. Can be overridden.
        Returns:
        the default buffer size.
      • getEncodedLength

        public long getEncodedLength​(byte[] pArray)
        Calculates the amount of space needed to encode the supplied array.
        Parameters:
        pArray - byte[] array which will later be encoded
        Returns:
        amount of space needed to encode the supplied array. Returns a long since a max-length array will require more than Integer.MAX_VALUE bytes
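The size calculation can be sketched from the documented fields: whole encoded blocks for the input, plus one chunk separator per line when chunking is enabled. This is an assumption about the arithmetic rather than the method's actual body, and the class name and parameter list are hypothetical (the real method takes only the array).

```java
final class EncodedLengthSketch {

    /** Estimates the encoded size in bytes for inputLength unencoded bytes. */
    static long encodedLength(long inputLength, int unencodedBlockSize, int encodedBlockSize,
                              int lineLength, int chunkSeparatorLength) {
        // Round the input up to whole blocks; each block expands to encodedBlockSize bytes.
        long len = ((inputLength + unencodedBlockSize - 1) / unencodedBlockSize) * encodedBlockSize;
        if (lineLength > 0) {
            // One separator per (possibly partial) line.
            len += ((len + lineLength - 1) / lineLength) * chunkSeparatorLength;
        }
        return len;
    }

    public static void main(String[] args) {
        // Base64 without chunking: 1 byte still needs a full 4-byte block.
        System.out.println(encodedLength(1, 3, 4, 0, 0));
        // Base64, 76-character lines with CRLF: 57 bytes fill exactly one line.
        System.out.println(encodedLength(57, 3, 4, 76, 2));
        // A maximum-length array exceeds the int range, hence the long return type.
        System.out.println(encodedLength(Integer.MAX_VALUE, 3, 4, 0, 0) > Integer.MAX_VALUE);
    }
}
```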
      • hasData

        boolean hasData​(BaseNCodec.Context context)
        Returns true if this object has buffered data for reading.
        Parameters:
        context - the context to be used
        Returns:
        true if there is data still available for reading.
      • isInAlphabet

        protected abstract boolean isInAlphabet​(byte value)
        Returns whether or not the octet is in the current alphabet. Does not allow whitespace or pad.
        Parameters:
        value - The value to test
        Returns:
        true if the value is defined in the current alphabet, false otherwise.
      • isInAlphabet

        public boolean isInAlphabet​(byte[] arrayOctet,
                                    boolean allowWSPad)
        Tests a given byte array to see if it contains only valid characters within the alphabet. The method optionally treats whitespace and pad as valid.
        Parameters:
        arrayOctet - byte array to test
        allowWSPad - if true, then whitespace and PAD are also allowed
        Returns:
        true if all bytes are valid characters in the alphabet or if the byte array is empty; false, otherwise
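A sketch of the array check, again using a hypothetical upper-case hexadecimal alphabet in place of the abstract isInAlphabet(byte) and assuming '=' as the pad byte:

```java
final class IsInAlphabetSketch {

    static final byte PAD = '='; // assumed pad byte

    // Hypothetical stand-in for the abstract isInAlphabet(byte).
    static boolean isInAlphabet(byte value) {
        return (value >= '0' && value <= '9') || (value >= 'A' && value <= 'F');
    }

    static boolean isWhiteSpace(byte b) {
        return b == ' ' || b == '\n' || b == '\r' || b == '\t';
    }

    /** Returns true if every byte is in the alphabet, or is whitespace/pad when allowWSPad is set. */
    static boolean isInAlphabet(byte[] arrayOctet, boolean allowWSPad) {
        for (byte octet : arrayOctet) {
            boolean toleratedExtra = allowWSPad && (octet == PAD || isWhiteSpace(octet));
            if (!isInAlphabet(octet) && !toleratedExtra) {
                return false;
            }
        }
        return true; // also reached for an empty array
    }

    public static void main(String[] args) {
        byte[] data = { '4', 'A', ' ', '4', 'B', '=' };
        System.out.println(isInAlphabet(data, true));
        System.out.println(isInAlphabet(data, false));
    }
}
```

An empty array passes because no byte fails the test, which matches the documented return value.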
      • isInAlphabet

        public boolean isInAlphabet​(java.lang.String basen)
        Tests a given String to see if it contains only valid characters within the alphabet. The method treats whitespace and PAD as valid.
        Parameters:
        basen - String to test
        Returns:
        true if all characters in the String are valid characters in the alphabet or if the String is empty; false, otherwise
        See Also:
        isInAlphabet(byte[], boolean)
      • isStrictDecoding

        public boolean isStrictDecoding()
        Returns true if decoding behavior is strict. Decoding will raise an IllegalArgumentException if trailing bits are not part of a valid encoding.

        The default is false for lenient decoding. In lenient mode, decoding will compose trailing bits into 8-bit bytes and discard the remainder.

        Returns:
        true if using strict decoding
        Since:
        1.15
      • readResults

        int readResults​(byte[] b,
                        int bPos,
                        int bAvail,
                        BaseNCodec.Context context)
        Extracts buffered data into the provided byte[] array, starting at position bPos, up to a maximum of bAvail bytes. Returns how many bytes were actually extracted.

        Package-private for access from I/O streams.

        Parameters:
        b - byte[] array to extract the buffered data into.
        bPos - position in byte[] array to start extraction at.
        bAvail - number of bytes we're allowed to extract. We may extract fewer (if fewer are available).
        context - the context to be used
        Returns:
        The number of bytes successfully extracted into the provided byte[] array.