Fix ZipInputStream cannot read STORED method from ZipZap #155

liuliu · 2016-12-23T18:30:20Z

This PR tried to fix the problem with ZipZap generated STORED format with ZipInputStream.

ZipInputStream is a widely used mechanism from JDK that supports reading Zip files. Since it deals with a stream of data, there are limitations in its implementation. For example, ZipInputStream cannot leverage the data from central file header to offset its payload. This is OK for deflated stream because deflated stream is self-descriptive, thus, by inflating the stream, it will naturally reach the end of the stream, and then pick up the data descriptor from the end of the payload for crc32 etc. However, if the payload is in STORED format, ZipInputStream relies on the local file header to provide information such as the uncompressed size, which in ZipZap case, skipped.

This works fine for other Zip file reader implementation or Java's own ZipFile since most of them are using central file header as the source of truth.

This PR fixed the problem by having the correct uncompressed / compressed size / crc32 code at the local file header if STORED format is used, and in that case, don't use data descriptor.

However, since we can only get the payload length at zero cause for data block payload, it is left to be desired about how to implement this for NSOutputStream / CGDataConsumerRef.

Please advice. Also, cc @rsanchezsaez

pixelglow · 2017-01-04T15:57:57Z

Excellent idea, proposal and implementation!

This change is actually generally useful as a slight optimization for smaller zip files and follows neatly on from @ed84124.

Only a couple of issues to fix. Please rebase all your changes to a single commit:

Minor nit: line 108 comment should mention data and no compression.
Issue: lines 278 - 304: logic here is sound but embeds another dependency on dataBlock and compressionLevel == 0 in the data writing code, which is hard to figure out when we next update this to support e.g. always outputting size + crc in local header. Prefer to separate out the header update logic in a separate phase before the data writing i.e. a separate if (!_compressionLevel && _dataBlock) {...} code. Once this code updates the header with the size + crc, then it should set a flag to inhibit the data descriptor write at line 337, instead of having to recheck _dataBlock, generalPurposeBitFlag etc.

Fix ZipInputStream cannot read STORED method from ZipZap

b1888d0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix ZipInputStream cannot read STORED method from ZipZap #155

Fix ZipInputStream cannot read STORED method from ZipZap #155

liuliu commented Dec 23, 2016

pixelglow commented Jan 4, 2017

Fix ZipInputStream cannot read STORED method from ZipZap #155

Are you sure you want to change the base?

Fix ZipInputStream cannot read STORED method from ZipZap #155

Conversation

liuliu commented Dec 23, 2016

pixelglow commented Jan 4, 2017