public class ZipArchiveInputStream extends ArchiveInputStream implements InputStreamStatistics
As of Apache Commons Compress it transparently supports Zip64 extensions and thus individual entries and archives larger than 4 GB or with more than 65536 entries.
The ZipFile class is preferred when reading from files, as ZipArchiveInputStream is limited by not being able to read the central directory header before returning entries.
See also: ZipFile
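One visible consequence of reading local headers only is that sizes stored in a trailing data descriptor are unknown at the moment an entry is returned. The sketch below illustrates this with the JDK's own streaming reader, java.util.zip.ZipInputStream, standing in for ZipArchiveInputStream so that it runs without extra dependencies. The class and method names are hypothetical, and the behaviour shown is that of the JDK reader, not a statement about Commons Compress internals.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; uses only java.util.zip as a stand-in for a streaming ZIP reader.
final class StreamingSizeDemo {

    /** Builds a one-entry archive in memory. The entry's sizes are not set up front,
     *  so the JDK writer records them in a data descriptor after the entry data. */
    static byte[] buildArchive(byte[] payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            out.putNextEntry(new ZipEntry("hello.txt"));
            out.write(payload);
            out.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns { size reported when the entry is handed out, size reported after reading all data }. */
    static long[] sizesBeforeAndAfter(byte[] archive) {
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry = in.getNextEntry();
            long before = entry.getSize();   // the local header carries no size yet
            byte[] scratch = new byte[512];
            while (in.read(scratch) != -1) {
                // drain the entry so the reader reaches the data descriptor
            }
            long after = entry.getSize();    // filled in from the data descriptor
            return new long[] { before, after };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A reader with access to the central directory, such as ZipFile, can report the size before any entry data is read.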
Modifier and Type | Class and Description |
---|---|
private class | ZipArchiveInputStream.BoundedInputStream: Bounded input stream adapted from commons-io |
private static class | ZipArchiveInputStream.CurrentEntry: Structure collecting information for the entry that is currently being read. |
Modifier and Type | Field and Description |
---|---|
private boolean | allowStoredEntriesWithDataDescriptor: Whether the stream will try to read STORED entries that use a data descriptor. |
private static byte[] | APK_SIGNING_BLOCK_MAGIC |
private java.nio.ByteBuffer | buf: Buffer used to read from the wrapped stream. |
private static byte[] | CFH |
private static int | CFH_LEN |
private boolean | closed: Whether the stream has been closed. |
private ZipArchiveInputStream.CurrentEntry | current: The entry that is currently being read. |
private static byte[] | DD |
(package private) java.lang.String | encoding |
private int | entriesRead |
private boolean | hitCentralDirectory: Whether the stream has reached the central directory - and thus found all entries. |
private java.io.InputStream | in: Wrapped stream, will always be a PushbackInputStream. |
private java.util.zip.Inflater | inf: Inflater used for all deflated entries. |
private java.io.ByteArrayInputStream | lastStoredEntry: When reading a stored entry that uses the data descriptor this stream has to read the full entry and caches it. |
private static byte[] | LFH |
private static int | LFH_LEN |
private byte[] | lfhBuf |
private static java.math.BigInteger | LONG_MAX |
private byte[] | shortBuf |
private byte[] | skipBuf |
private static long | TWO_EXP_32 |
private byte[] | twoDwordBuf |
private long | uncompressedCount: Count decompressed bytes for current entry |
private boolean | useUnicodeExtraFields: Whether to look for and use Unicode extra fields. |
private byte[] | wordBuf |
private ZipEncoding | zipEncoding: The zip encoding to use for filenames and the file comment. |
Constructor and Description |
---|
ZipArchiveInputStream(java.io.InputStream inputStream): Create an instance using UTF-8 encoding |
ZipArchiveInputStream(java.io.InputStream inputStream, java.lang.String encoding): Create an instance using the specified encoding |
ZipArchiveInputStream(java.io.InputStream inputStream, java.lang.String encoding, boolean useUnicodeExtraFields): Create an instance using the specified encoding |
ZipArchiveInputStream(java.io.InputStream inputStream, java.lang.String encoding, boolean useUnicodeExtraFields, boolean allowStoredEntriesWithDataDescriptor): Create an instance using the specified encoding |
Modifier and Type | Method and Description |
---|---|
private boolean | bufferContainsSignature(java.io.ByteArrayOutputStream bos, int offset, int lastRead, int expectedDDLen): Checks whether the current buffer contains the signature of a "data descriptor", "local file header" or "central directory entry". |
private int | cacheBytesRead(java.io.ByteArrayOutputStream bos, int offset, int lastRead, int expecteDDLen): If the last read bytes could hold a data descriptor and an incomplete signature then save the last bytes to the front of the buffer and cache everything in front of the potential data descriptor into the given ByteArrayOutputStream. |
boolean | canReadEntryData(ArchiveEntry ae): Whether this class is able to read the given entry. |
private static boolean | checksig(byte[] signature, byte[] expected) |
void | close() |
private void | closeEntry(): Closes the current ZIP archive entry and positions the underlying stream to the beginning of the next entry. |
private boolean | currentEntryHasOutstandingBytes(): If the compressed size of the current entry is included in the entry header and there are any outstanding bytes in the underlying stream, then this returns true. |
private void | drainCurrentEntryData(): Read all data of the current entry from the underlying stream that hasn't been read yet. |
private int | fill() |
private void | findEocdRecord(): Reads forward until the signature of the "End of central directory" record is found. |
private long | getBytesInflated(): Get the number of bytes Inflater has actually processed. |
long | getCompressedCount() |
ArchiveEntry | getNextEntry(): Returns the next Archive Entry in this Stream. |
ZipArchiveEntry | getNextZipEntry() |
long | getUncompressedCount() |
private boolean | isApkSigningBlock(byte[] suspectLocalFileHeader): Checks whether this might be an APK Signing Block. |
private boolean | isFirstByteOfEocdSig(int b) |
static boolean | matches(byte[] signature, int length): Checks if the signature matches what is expected for a zip file. |
private void | processZip64Extra(ZipLong size, ZipLong cSize): Records whether a Zip64 extra is present and sets the size information from it if sizes are 0xFFFFFFFF and the entry doesn't use a data descriptor. |
private void | pushback(byte[] buf, int offset, int length) |
int | read(byte[] buffer, int offset, int length) |
private void | readDataDescriptor() |
private int | readDeflated(byte[] buffer, int offset, int length): Implementation of read for DEFLATED entries. |
private void | readFirstLocalFileHeader(byte[] lfh): Fills the given array with the first local file header and deals with splitting/spanning markers that may prefix the first LFH. |
private int | readFromInflater(byte[] buffer, int offset, int length): Potentially reads more bytes to fill the inflater's buffer and reads from it. |
private void | readFully(byte[] b) |
private void | readFully(byte[] b, int off) |
private int | readOneByte(): Reads bytes by reading from the underlying stream rather than the (potentially inflating) archive stream - which read(byte[], int, int) would do. |
private int | readStored(byte[] buffer, int offset, int length): Implementation of read for STORED entries. |
private void | readStoredEntry(): Caches a stored entry that uses the data descriptor. |
private void | realSkip(long value): Skips bytes by reading from the underlying stream rather than the (potentially inflating) archive stream - which skip(long) would do. |
long | skip(long value): Skips over and discards value bytes of data from this input stream. |
private void | skipRemainderOfArchive(): Reads the stream until it finds the "End of central directory record" and consumes it as well. |
private boolean | supportsCompressedSizeFor(ZipArchiveEntry entry): Whether the compressed size for the entry is either known or not required by the compression method being used. |
private boolean | supportsDataDescriptorFor(ZipArchiveEntry entry): Whether this entry requires a data descriptor this library can work with. |
Methods inherited from class ArchiveInputStream: count, count, getBytesRead, getCount, pushedBackBytes, read
private final ZipEncoding zipEncoding
final java.lang.String encoding
private final boolean useUnicodeExtraFields
private final java.io.InputStream in
private final java.util.zip.Inflater inf
private final java.nio.ByteBuffer buf
private ZipArchiveInputStream.CurrentEntry current
private boolean closed
private boolean hitCentralDirectory
private java.io.ByteArrayInputStream lastStoredEntry
private boolean allowStoredEntriesWithDataDescriptor
private long uncompressedCount
private static final int LFH_LEN
private static final int CFH_LEN
private static final long TWO_EXP_32
private final byte[] lfhBuf
private final byte[] skipBuf
private final byte[] shortBuf
private final byte[] wordBuf
private final byte[] twoDwordBuf
private int entriesRead
private static final byte[] LFH
private static final byte[] CFH
private static final byte[] DD
private static final byte[] APK_SIGNING_BLOCK_MAGIC
private static final java.math.BigInteger LONG_MAX
public ZipArchiveInputStream(java.io.InputStream inputStream)
inputStream - the stream to wrap
public ZipArchiveInputStream(java.io.InputStream inputStream, java.lang.String encoding)
inputStream - the stream to wrap
encoding - the encoding to use for file names, use null for the platform's default encoding
public ZipArchiveInputStream(java.io.InputStream inputStream, java.lang.String encoding, boolean useUnicodeExtraFields)
inputStream - the stream to wrap
encoding - the encoding to use for file names, use null for the platform's default encoding
useUnicodeExtraFields - whether to use InfoZIP Unicode Extra Fields (if present) to set the file names.
public ZipArchiveInputStream(java.io.InputStream inputStream, java.lang.String encoding, boolean useUnicodeExtraFields, boolean allowStoredEntriesWithDataDescriptor)
inputStream - the stream to wrap
encoding - the encoding to use for file names, use null for the platform's default encoding
useUnicodeExtraFields - whether to use InfoZIP Unicode Extra Fields (if present) to set the file names.
allowStoredEntriesWithDataDescriptor - whether the stream will try to read STORED entries that use a data descriptor
public ZipArchiveEntry getNextZipEntry() throws java.io.IOException
java.io.IOException
private void readFirstLocalFileHeader(byte[] lfh) throws java.io.IOException
java.io.IOException
private void processZip64Extra(ZipLong size, ZipLong cSize)
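The summary for processZip64Extra says that size information is taken from the Zip64 extra field when the header sizes are 0xFFFFFFFF and the entry doesn't use a data descriptor. A minimal sketch of that rule for a single size value follows. The class, the method and its parameters are hypothetical, and returning -1 for "not known yet" is an assumption made for the illustration rather than the library's own convention.

```java
// Hypothetical helper illustrating the rule described for processZip64Extra.
final class Zip64SizeSketch {

    /** Value found in a 32-bit size field when the real size lives in the Zip64 extra field. */
    static final long ZIP64_MAGIC = 0xFFFFFFFFL;

    /**
     * @param headerSize        size from the local file header, read as an unsigned 32-bit value
     * @param zip64Size         size from the Zip64 extra field, or null if the entry has none
     * @param hasDataDescriptor whether the entry stores its sizes in a trailing data descriptor
     * @return the size to use, or -1 if it is only known once the data descriptor has been read
     */
    static long resolveSize(long headerSize, Long zip64Size, boolean hasDataDescriptor) {
        if (hasDataDescriptor) {
            return -1L; // header values are placeholders in this case
        }
        if (headerSize == ZIP64_MAGIC && zip64Size != null) {
            return zip64Size; // the 32-bit field overflowed, trust the 64-bit value
        }
        return headerSize;
    }
}
```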
public ArchiveEntry getNextEntry() throws java.io.IOException
Description copied from class: ArchiveInputStream
Returns the next Archive Entry in this Stream.
getNextEntry in class ArchiveInputStream
null if there are no more entries
java.io.IOException - if the next entry could not be read
public boolean canReadEntryData(ArchiveEntry ae)
May return false if it is set up to use encryption or a compression method that hasn't been implemented yet.
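The following sketch shows the kind of check this describes, under stated assumptions: bit 0 of the general purpose bit flag marks an encrypted entry, and only the STORED (0) and DEFLATED (8) methods are treated as supported. The class and method are hypothetical, and the set of methods the library can actually read may be larger.

```java
// Hypothetical illustration of an "is this entry readable" test.
final class ReadabilitySketch {

    static final int ENCRYPTION_FLAG = 0x0001; // bit 0 of the general purpose bit flag
    static final int METHOD_STORED = 0;
    static final int METHOD_DEFLATED = 8;

    /** Returns true if the entry is neither encrypted nor compressed with an unsupported method. */
    static boolean canRead(int generalPurposeFlags, int method) {
        boolean encrypted = (generalPurposeFlags & ENCRYPTION_FLAG) != 0;
        boolean methodSupported = method == METHOD_STORED || method == METHOD_DEFLATED;
        return !encrypted && methodSupported;
    }
}
```

Callers typically use such a check to decide whether to read an entry's data or move on to the next entry.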
canReadEntryData in class ArchiveInputStream
ae - the entry to test
public int read(byte[] buffer, int offset, int length) throws java.io.IOException
read in class java.io.InputStream
java.io.IOException
public long getCompressedCount()
getCompressedCount in interface InputStreamStatistics
public long getUncompressedCount()
getUncompressedCount in interface InputStreamStatistics
private int readStored(byte[] buffer, int offset, int length) throws java.io.IOException
java.io.IOException
private int readDeflated(byte[] buffer, int offset, int length) throws java.io.IOException
java.io.IOException
private int readFromInflater(byte[] buffer, int offset, int length) throws java.io.IOException
java.io.IOException
public void close() throws java.io.IOException
close in interface java.io.Closeable
close in interface java.lang.AutoCloseable
close in class java.io.InputStream
java.io.IOException
public long skip(long value) throws java.io.IOException
This implementation may end up skipping over some smaller number of bytes, possibly 0, if and only if it reaches the end of the underlying stream.
The actual number of bytes skipped is returned.
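The contract above can be sketched with a read-based skip over any InputStream: bytes are read into a scratch buffer until the requested count is reached or the stream ends. This is an illustration only. The class name, method name and buffer size are assumptions, and the sketch wraps IOException in UncheckedIOException for brevity, which the documented method does not do.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper; not the library's implementation.
final class SkipSketch {

    /** Skips up to value bytes by reading them; returns how many bytes were actually skipped. */
    static long skipByReading(InputStream in, long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must not be negative");
        }
        byte[] scratch = new byte[1024];
        long skipped = 0;
        try {
            while (skipped < value) {
                int chunk = (int) Math.min(scratch.length, value - skipped);
                int n = in.read(scratch, 0, chunk);
                if (n == -1) {
                    break; // end of stream: report the smaller count
                }
                skipped += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return skipped;
    }
}
```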
skip in class java.io.InputStream
value - the number of bytes to be skipped.
java.io.IOException - if an I/O error occurs.
java.lang.IllegalArgumentException - if value is negative.
public static boolean matches(byte[] signature, int length)
signature - the bytes to check
length - the number of bytes to check
private static boolean checksig(byte[] signature, byte[] expected)
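matches(byte[], int) checks whether a signature matches what is expected for a zip file, and checksig compares a signature against one expected byte sequence. A minimal sketch of such a check follows. It assumes the well-known four-byte markers for a local file header ("PK\003\004") and for the end of central directory record of an empty archive ("PK\005\006"). The class and method names are hypothetical, and the library may accept further signatures.

```java
// Hypothetical illustration of a magic-number check for ZIP data.
final class ZipSignatureSketch {

    static final byte[] LFH = { 0x50, 0x4B, 0x03, 0x04 };  // local file header
    static final byte[] EOCD = { 0x50, 0x4B, 0x05, 0x06 }; // end of central directory (empty archive)

    /** Compares the leading bytes of signature with expected; signature must be at least as long. */
    static boolean startsWith(byte[] signature, byte[] expected) {
        for (int i = 0; i < expected.length; i++) {
            if (signature[i] != expected[i]) {
                return false;
            }
        }
        return true;
    }

    /** Returns true if the first length bytes are enough to recognise one of the two markers. */
    static boolean looksLikeZip(byte[] signature, int length) {
        if (length < LFH.length || signature.length < LFH.length) {
            return false; // too few bytes to decide
        }
        return startsWith(signature, LFH) || startsWith(signature, EOCD);
    }
}
```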
private void closeEntry() throws java.io.IOException
If the compressed size of this entry is included in the entry header, then any outstanding bytes are simply skipped from the underlying stream without uncompressing them. This allows an entry to be safely closed even if the compression method is unsupported.
In case we don't know the compressed size of this entry or have already buffered too much data from the underlying stream to support uncompression, then the uncompression process is completed and the end position of the stream is adjusted based on the result of that process.
java.io.IOException - if an error occurs
private boolean currentEntryHasOutstandingBytes()
private void drainCurrentEntryData() throws java.io.IOException
java.io.IOException
private long getBytesInflated()
For Java versions before Java 7, the getBytes* methods in Inflater/Deflater seem to return unsigned 32-bit values that start over at 0 once they reach 2^32, rather than true longs.
The stream knows how many bytes it has read, but not how many the Inflater actually consumed - it should be between the total number of bytes read for the entry and the total number minus the last read operation. Here we just try to make the value close enough to the bytes we've read by assuming the number of bytes consumed must be smaller than (or equal to) the number of bytes read but not smaller by more than 2^32.
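The correction described here amounts to adding multiples of 2^32 to the wrapped counter for as long as the result does not exceed the number of bytes read from the stream. A minimal sketch of that arithmetic, with hypothetical class, method and parameter names:

```java
// Hypothetical illustration of undoing a 32-bit wrap-around in a byte counter.
final class InflaterCountSketch {

    static final long TWO_EXP_32 = 1L << 32;

    /**
     * @param wrappedCount        counter value that may have started over at 0 after 2^32 bytes
     * @param bytesReadFromStream bytes read from the underlying stream for the entry
     * @return the largest value of the form wrappedCount + k * 2^32 that does not exceed
     *         bytesReadFromStream, or wrappedCount itself if no such k greater than 0 exists
     */
    static long estimateBytesInflated(long wrappedCount, long bytesReadFromStream) {
        long estimate = wrappedCount;
        while (estimate + TWO_EXP_32 <= bytesReadFromStream) {
            estimate += TWO_EXP_32;
        }
        return estimate;
    }
}
```

For example, a wrapped counter of 5 with 2^32 + 10 bytes read yields 2^32 + 5, while a counter of 100 with 200 bytes read is returned unchanged.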
private int fill() throws java.io.IOException
java.io.IOException
private void readFully(byte[] b) throws java.io.IOException
java.io.IOException
private void readFully(byte[] b, int off) throws java.io.IOException
java.io.IOException
private void readDataDescriptor() throws java.io.IOException
java.io.IOException
private boolean supportsDataDescriptorFor(ZipArchiveEntry entry)
private boolean supportsCompressedSizeFor(ZipArchiveEntry entry)
private void readStoredEntry() throws java.io.IOException
After calling this method the entry should know its size, the entry's data is cached and the stream is positioned at the next local file or central directory header.
java.io.IOException
private boolean bufferContainsSignature(java.io.ByteArrayOutputStream bos, int offset, int lastRead, int expectedDDLen) throws java.io.IOException
If it contains such a signature, reads the data descriptor and positions the stream right after the data descriptor.
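The signatures in question are those of a data descriptor, a local file header or a central directory entry. The sketch below scans a buffer for the first of these markers, assuming their usual byte values ("PK" followed by 7,8 or 3,4 or 1,2). It only locates the marker and does not read the data descriptor or reposition any stream. The class and method names are hypothetical.

```java
// Hypothetical illustration of scanning a read buffer for ZIP record signatures.
final class SignatureScanSketch {

    /**
     * Returns the index of the first complete DD, LFH or CFH signature within buf[0..limit),
     * or -1 if none is found. A signature cut off by limit is not reported.
     */
    static int indexOfSignature(byte[] buf, int limit) {
        for (int i = 0; i + 4 <= limit; i++) {
            if (buf[i] != 0x50 || buf[i + 1] != 0x4B) {
                continue; // every signature starts with "PK"
            }
            int b2 = buf[i + 2];
            int b3 = buf[i + 3];
            boolean lfh = b2 == 0x03 && b3 == 0x04; // local file header
            boolean cfh = b2 == 0x01 && b3 == 0x02; // central directory entry
            boolean dd = b2 == 0x07 && b3 == 0x08;  // data descriptor
            if (lfh || cfh || dd) {
                return i;
            }
        }
        return -1;
    }
}
```

The case where a signature is cut off at the end of the buffer is why a reader has to keep the last few bytes around for the next read.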
java.io.IOException
private int cacheBytesRead(java.io.ByteArrayOutputStream bos, int offset, int lastRead, int expecteDDLen)
Data descriptor plus incomplete signature (3 bytes in the worst case) can be 20 bytes max.
private void pushback(byte[] buf, int offset, int length) throws java.io.IOException
java.io.IOException
private void skipRemainderOfArchive() throws java.io.IOException
java.io.IOException
private void findEocdRecord() throws java.io.IOException
java.io.IOException
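private void realSkip(long value) throws java.io.IOException
Skips bytes by reading from the underlying stream rather than the (potentially inflating) archive stream - which skip(long) would do.
Also updates bytes-read counter.
java.io.IOException
private int readOneByte() throws java.io.IOException
Reads bytes by reading from the underlying stream rather than the (potentially inflating) archive stream - which read(byte[], int, int) would do.
Also updates bytes-read counter.
java.io.IOException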
private boolean isFirstByteOfEocdSig(int b)
private boolean isApkSigningBlock(byte[] suspectLocalFileHeader) throws java.io.IOException
Unfortunately the APK signing block does not start with some kind of signature, it rather ends with one. It starts with a length, so what we do is parse the suspect length, skip ahead far enough, look for the signature and if we've found it, return true.
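The approach can be sketched on an in-memory byte array instead of a stream. The sketch assumes the Android APK Signing Block layout: an 8-byte little-endian length (not counting the length field itself), the payload, the same length repeated, and the 16-byte ASCII magic "APK Sig Block 42" at the very end. The magic value, the class and both methods are assumptions made for the illustration; the page above does not show the value of APK_SIGNING_BLOCK_MAGIC.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of "parse the length, skip ahead, look for the trailing magic".
final class ApkSigningBlockSketch {

    static final byte[] MAGIC = "APK Sig Block 42".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if data starts with a length field and the magic sits where that length says. */
    static boolean looksLikeApkSigningBlock(byte[] data) {
        if (data.length < 8) {
            return false;
        }
        long declaredLength = ByteBuffer.wrap(data, 0, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
        // The block must hold at least the repeated length (8 bytes) and the magic (16 bytes),
        // and it must fit into the bytes we have.
        if (declaredLength < 8 + MAGIC.length || declaredLength > data.length - 8) {
            return false;
        }
        int magicStart = (int) (8 + declaredLength - MAGIC.length);
        byte[] tail = Arrays.copyOfRange(data, magicStart, magicStart + MAGIC.length);
        return Arrays.equals(tail, MAGIC);
    }

    /** Builds a minimal block with payloadLength zero bytes of payload, for demonstration only. */
    static byte[] buildBlock(int payloadLength) {
        long length = payloadLength + 8L + MAGIC.length;
        ByteBuffer buf = ByteBuffer.allocate((int) (8 + length)).order(ByteOrder.LITTLE_ENDIAN);
        buf.putLong(length);
        buf.position(buf.position() + payloadLength); // leave the payload zero-filled
        buf.putLong(length);
        buf.put(MAGIC);
        return buf.array();
    }
}
```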
suspectLocalFileHeader - the bytes read from the underlying stream in the expectation that they would hold the local file header of the next entry.
java.io.IOException