add a sample spec with invisible text
This PDF sets the alpha constant to 0 for some of the text. For now,
pdf-reader ignores the alpha constant and extracts all text.

There's a possibility the text extraction will gain an option to ignore
invisible characters though, in which case this sample will come in
handy.

See #43
yob committed Oct 18, 2019
1 parent e78f1ec commit ee7a572
Showing 2 changed files with 3 additions and 0 deletions.
Binary file added spec/data/invisible.pdf
Binary file not shown.
3 changes: 3 additions & 0 deletions spec/integrity.yml
@@ -209,6 +209,9 @@ data/invalid/trailer_is_not_a_dict.pdf:
 data/invalid/trailer_root_is_not_a_dict.pdf:
   :bytes: 108807
   :md5: e00b7fc6999ca722aa31d2bb90f1e5d0
+data/invisible.pdf:
+  :bytes: 14364
+  :md5: 563f7ec8eb2c4d54f00fd85d807580c0
 data/junk_prefix.pdf:
   :bytes: 934
   :md5: c03c9a96cdefa78c9475b619d87e39bf
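
Each entry in spec/integrity.yml pairs a sample file with its byte count and MD5 digest. Below is a minimal sketch of how such an entry could be checked using only the Ruby standard library. The helper names `matches_integrity_entry?` and `integrity_failures` are hypothetical and not taken from pdf-reader; this commit does not show how the project's own spec harness consumes the file.

```ruby
require "digest"
require "yaml"

# Hypothetical helper: true when the file at `path` has the size and MD5
# digest recorded in `entry` (a Hash with :bytes and :md5 keys).
def matches_integrity_entry?(path, entry)
  File.size(path) == entry[:bytes] &&
    Digest::MD5.file(path).hexdigest == entry[:md5]
end

# Hypothetical helper: parses a manifest in the integrity.yml layout and
# returns the relative paths that are missing or do not match their entry.
def integrity_failures(manifest_yaml, base_dir)
  # The manifest uses Ruby symbol keys (:bytes, :md5), so Symbol must be
  # explicitly permitted when loading safely.
  manifest = YAML.safe_load(manifest_yaml, permitted_classes: [Symbol])

  failures = manifest.reject do |relative_path, entry|
    full_path = File.join(base_dir, relative_path)
    File.file?(full_path) && matches_integrity_entry?(full_path, entry)
  end
  failures.keys
end
```

Assuming the manifest lives at spec/integrity.yml and its paths are relative to spec/, a call such as `integrity_failures(File.read("spec/integrity.yml"), "spec")` would return an empty array when every listed file is intact.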