PDFs protected from Adobe Acrobat online get corrupted when saved with lopdf #256

Open
shantanugoel opened this issue Jan 1, 2024 · 3 comments


@shantanugoel

Any file password-protected with Adobe Acrobat online fails to open in any PDF viewer after being saved with lopdf, even if the operation is as simple as opening the file and saving it again without performing any other operation on it.
This may affect any file generated by Acrobat, but I don't have a paid account, so I can only test files I generated with their free protection offering.
test_protected.pdf
test_protected_lopdfsaved.pdf
test_orig.pdf

Attaching files:

  • test_orig.pdf - Original file
  • test_orig_protected.pdf - acrobat generated
  • test_orig_protected_lopdfsaved.pdf - saved by lopdf

The last file is corrupt. Password is "aaaaaa".
Other libraries and utilities (I tried pdf-rs, qpdf, pdfium, etc.) save these files correctly, with or without decrypting them.

@shantanugoel
Author

One relevant detail I found while debugging: these files use V 4, R 4, which lopdf doesn't currently support for decryption (AES-128). I added that support in my local branch of lopdf, but then found that the issue happens with or without decryption.
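For readers unfamiliar with what V 4 / R 4 decryption involves, here is a minimal sketch of one step from the PDF 1.7 standard security handler: building the input that is hashed with MD5 to derive a per-object key. It is not taken from lopdf or from the local branch mentioned above. The function names are hypothetical, the MD5 step itself is left out to keep the sketch dependency-free, and the file key used in `main` is a placeholder.

```rust
/// Builds the byte string that the standard security handler hashes with MD5
/// to derive a per-object key. The hashing itself is not done here.
fn object_key_hash_input(file_key: &[u8], obj_num: u32, gen_num: u16, aes: bool) -> Vec<u8> {
    let mut input = Vec::with_capacity(file_key.len() + 9);
    input.extend_from_slice(file_key);
    // Low-order 3 bytes of the object number, least significant byte first.
    input.extend_from_slice(&obj_num.to_le_bytes()[..3]);
    // Low-order 2 bytes of the generation number, least significant byte first.
    input.extend_from_slice(&gen_num.to_le_bytes());
    if aes {
        // AES crypt filters (/CFM /AESV2) append the fixed salt bytes "sAlT".
        input.extend_from_slice(b"sAlT");
    }
    input
}

/// Number of leading MD5 bytes used as the object key: n + 5, capped at 16.
fn object_key_len(file_key_len: usize) -> usize {
    (file_key_len + 5).min(16)
}

fn main() {
    // Placeholder 128-bit file key; a real one comes from the password algorithm.
    let file_key = [0u8; 16];
    let input = object_key_hash_input(&file_key, 23, 0, true);
    println!(
        "hash input: {} bytes, object key: {} bytes",
        input.len(),
        object_key_len(file_key.len())
    );
}
```

With AES-128, the resulting 16-byte key decrypts each string or stream in CBC mode, with the first 16 bytes of the encrypted data serving as the initialization vector.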

@RossAnder

Hi @J-F-Liu, I'm facing a similar issue. The file appears to use AES-128 encryption but is not password protected. I'm hoping you can advise on what crypto capabilities lopdf has. pdf-rs seems to have some capabilities in this regard, but those crates are not really dependable.

Here's the debug output from the get_encrypted function called on the Document struct:
<</Filter /Standard/Length 128/V 4/R 4/U <17b478e015e10c0d80cd94e92a11e9fc00000000000000000000000000000000>/O <47049aec225d12e5d35a369667756393aa48a808d49426128e08c55c657c3395>/P -4/StmF /StdCF/StrF /StdCF/EncryptMetadata false/CF <</StdCF <</Length 16/CFM /AESV2/AuthEvent /DocOpen>>>>>>
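As a reading aid for the dictionary above, the following sketch maps the /CFM name to the cipher it implies under the PDF 1.7 standard security handler, and decodes individual permission bits from /P. The enum and function names are hypothetical and are not part of lopdf. The values in `main` are copied by hand from the dump rather than parsed from a file.

```rust
/// Cipher implied by a crypt filter's /CFM value (PDF 1.7 standard security handler).
#[derive(Debug, PartialEq)]
enum Cipher {
    None,
    Rc4,
    Aes128Cbc,
    Unknown,
}

fn cipher_for_cfm(cfm: &str) -> Cipher {
    match cfm {
        "None" => Cipher::None,
        "V2" => Cipher::Rc4,
        "AESV2" => Cipher::Aes128Cbc,
        _ => Cipher::Unknown,
    }
}

/// True when permission bit `bit` (1-based, as numbered in the PDF spec) is set in /P.
fn permission_bit(p: i32, bit: u32) -> bool {
    (((p as u32) >> (bit - 1)) & 1) == 1
}

fn main() {
    // Values copied from the dictionary above: /V 4, /CFM /AESV2, /P -4.
    let (v, cfm, p) = (4, "AESV2", -4i32);
    println!("V={v}, cipher={:?}", cipher_for_cfm(cfm));
    // Bit 3 = print, bit 4 = modify, bit 5 = copy/extract.
    for bit in [3, 4, 5] {
        println!("permission bit {bit}: {}", permission_bit(p, bit));
    }
}
```

A /P of -4 has every bit set except the two reserved low bits, so no permission is restricted. That fits a file that is encrypted yet opens without prompting for a password.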

Here's the full debug output from the Document struct:
Document { version: "1.7", trailer: <</Root 18 0 R/Info 15 0 R/Encrypt 22 0 R/ID [<0bac900a9f44a170bd291323ebf2008b> <0bac900a9f44a170bd291323ebf2008b>]/Type /XRef/Size 26>>, reference_table: Xref { cross_reference_type: CrossReferenceStream, entries: {1: Compressed { container: 24, index: 0 }, 2: Normal { offset: 15, generation: 0 }, 4: Normal { offset: 53998, generation: 0 }, 6: Normal { offset: 54094, generation: 0 }, 7: Compressed { container: 24, index: 1 }, 9: Compressed { container: 24, index: 2 }, 10: Compressed { container: 24, index: 3 }, 11: Normal { offset: 61520, generation: 0 }, 13: Normal { offset: 61619, generation: 0 }, 14: Compressed { container: 24, index: 4 }, 15: Compressed { container: 24, index: 5 }, 16: Compressed { container: 24, index: 6 }, 17: Compressed { container: 24, index: 7 }, 18: Normal { offset: 68134, generation: 0 }, 22: Normal { offset: 68196, generation: 0 }, 23: Normal { offset: 68492, generation: 0 }, 24: Normal { offset: 69480, generation: 0 }, 25: Normal { offset: 70486, generation: 0 }}, size: 26 }, objects: {(2, 0): <</Subtype /Image/Width 656/Height 225/BitsPerComponent 8/ColorSpace /DeviceRGB/Filter /DCTDecode/Length 53840>>stream...endstream, (4, 0): <</Type /Page/Parent 1 0 R/Contents 6 0 R/Resources 7 0 R/MediaBox [0 0 612 792]>>, (6, 0): <</Length 7376>>stream...endstream, (11, 0): <</Type /Page/Parent 1 0 R/Contents 13 0 R/Resources 14 0 R/MediaBox [0 0 612 792]>>, (13, 0): <</Length 6464>>stream...endstream, (18, 0): <</Type /Catalog/Pages 1 0 R/Metadata 23 0 R>>, (22, 0): <</Filter /Standard/Length 128/V 4/R 4/U <17b478e015e10c0d80cd94e92a11e9fc00000000000000000000000000000000>/O <47049aec225d12e5d35a369667756393aa48a808d49426128e08c55c657c3395>/P -4/StmF /StdCF/StrF /StdCF/EncryptMetadata false/CF <</StdCF <</Length 16/CFM /AESV2/AuthEvent /DocOpen>>>>>>, (23, 0): <</Type /Metadata/Subtype /XML/Filter /Crypt/Length 899>>stream...endstream, (24, 0): <</Type /ObjStm/N 8/First 54/Length 0>>stream...endstream, (25, 
0): <</Root 18 0 R/Info 15 0 R/Encrypt 22 0 R/ID [<0bac900a9f44a170bd291323ebf2008b> <0bac900a9f44a170bd291323ebf2008b>]/Type /XRef/W [1 4 2]/Filter /FlateDecode/Index [0 26]/Size 26/Length 108>>stream...endstream}, max_id: 25, max_bookmark_id: 0, bookmarks: [], bookmark_table: {}, xref_start: 70486 }

@RossAnder

@shantanugoel I've used the mupdf crate (a safe wrapper around MuPDF) to write a function that cleans these troublesome files and removes their encryption. I'll just drop the code here with full crate paths, in case you're open to a workaround until lopdf can handle this encryption.

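// Re-writes the PDF at `path` without encryption and returns the new bytes.
// Note: the `unwrap` calls panic on any MuPDF error instead of returning `Err`.
fn mu_parse(path: &str) -> std::io::Result<Vec<u8>> {
    let doc = mupdf::pdf::document::PdfDocument::open(path).unwrap();
    let mut buffer: Vec<u8> = Vec::new();

    // Strip encryption and ask MuPDF to clean and sanitize the output.
    let mut options = mupdf::pdf::document::PdfWriteOptions::default();
    options
        .set_encryption(mupdf::pdf::Encryption::None)
        .set_clean(true)
        .set_sanitize(true)
        .set_pretty(true);
    // Serialize the rewritten document into the in-memory buffer.
    doc.write_to_with_options(&mut buffer, options).unwrap();

    Ok(buffer)
}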

This allows me to handle lopdf encryption errors by passing the file to the above function and the output back to lopdf.
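The fallback flow described above can be sketched generically as follows. This is an illustration only: `load_with_fallback` is a hypothetical helper, it works on in-memory bytes rather than a path as `mu_parse` does, and the closures in `main` are stand-ins for the real lopdf parse step and the MuPDF cleaning step so that the sketch has no external dependencies.

```rust
/// Tries `parse` on the raw bytes; if that fails, runs `clean` and parses the result.
fn load_with_fallback<D, E>(
    bytes: &[u8],
    parse: impl Fn(&[u8]) -> Result<D, E>,
    clean: impl Fn(&[u8]) -> Result<Vec<u8>, E>,
) -> Result<D, E> {
    match parse(bytes) {
        Ok(doc) => Ok(doc),
        Err(_) => {
            // The first attempt failed: rewrite the input, then parse again.
            let cleaned = clean(bytes)?;
            parse(&cleaned)
        }
    }
}

fn main() {
    // Stand-in parser: rejects input that starts with b"ENC", otherwise returns its length.
    let parse = |b: &[u8]| if b.starts_with(b"ENC") { Err("encrypted") } else { Ok(b.len()) };
    // Stand-in cleaner: drops the first three bytes, mimicking "remove encryption".
    let clean = |b: &[u8]| Ok(b[3..].to_vec());
    let result = load_with_fallback(b"ENCdata", parse, clean);
    println!("{result:?}");
}
```

In a real program the parse closure would wrap the lopdf loading call and the clean closure would wrap something like `mu_parse`, with both error types converted to one common type.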
