Script for pulling every line from a set of PDFs that contains a match for any string in a specified list. The initial use case was quickly pulling out the rows of tables that contain a match. See example/.
Note: These instructions are written for macOS (>= Sierra) and assume basic command line familiarity.
Assuming you have Homebrew installed, install pdfgrep:
brew install pdfgrep
This may take a minute or two.
- Download search_pdfs.sh and put it into the same directory as the PDFs you want to search. cd into that directory and run chmod u+x search-pdfs.sh to make it executable.
- Create a text file in the directory called search_strings.txt. It should contain all the strings you want to search, one string per line. See the example search_strings.txt.
- Run ./search-pdfs.sh.
- Once it completes, there should be a new file called results.txt in the directory containing all the matched lines. See the example results.txt.
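For readers curious about what happens when the script runs, the steps above can be sketched roughly as follows. This is not the actual script, only a minimal illustration under some assumptions: the function name search_pdfs is hypothetical, each search string is treated as a literal (not a regular expression), and every match is prefixed with the name of the PDF it came from. The real script may format its output differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the search loop; the real script may differ.

# search_pdfs [STRINGS_FILE] [OUTPUT_FILE]
# Searches every *.pdf in the current directory for each line of STRINGS_FILE
# and collects the matching lines in OUTPUT_FILE.
search_pdfs() {
  local strings_file="${1:-search_strings.txt}"
  local output_file="${2:-results.txt}"
  local term

  : > "$output_file"   # start with an empty results file

  # Read one search string per line; the extra test keeps a final line
  # that has no trailing newline.
  while IFS= read -r term || [ -n "$term" ]; do
    [ -z "$term" ] && continue   # skip blank lines

    # -F: treat the string literally; -H: prefix each match with its file name.
    # pdfgrep exits non-zero when nothing matches, so that status is ignored.
    # stdin is redirected so pdfgrep cannot consume the list of strings.
    pdfgrep -F -H -- "$term" ./*.pdf >> "$output_file" < /dev/null || true
  done < "$strings_file"
}

# Usage (from the directory that holds the PDFs):
#   search_pdfs search_strings.txt results.txt
```

Looping over the strings one at a time keeps the results grouped by search string, at the cost of reading each PDF once per string.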