Hello everyone! Take this example:

from docling.document_converter import DocumentConverter
conv_res = DocumentConverter().convert("https://pdfobject.com/pdf/sample.pdf")
print(conv_res.model_dump()["pages"])

prints:

My guess is that the PDF was converted to an image of the given size, which was then used as input for the OCR. I would like to draw the bounding boxes on this image, hence the question: can I access it anywhere? TIA!
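For the drawing step, a bounding box given in page units has to be mapped onto the pixel grid of the page image first. Here is a minimal sketch of that mapping. The helper name `bbox_to_pixels`, the `(left, top, right, bottom)` tuple layout, and the bottom-left coordinate origin are assumptions made for illustration, not something confirmed in this thread, so check them against the boxes you actually get.

```python
def bbox_to_pixels(bbox, page_size, image_size, bottom_left_origin=True):
    """Map a (left, top, right, bottom) box in page units to pixel coordinates.

    Hypothetical helper: the tuple layout and the origin convention are assumptions.
    """
    left, top, right, bottom = bbox
    page_w, page_h = page_size
    img_w, img_h = image_size

    # Scale factors between page units and image pixels
    sx = img_w / page_w
    sy = img_h / page_h

    if bottom_left_origin:
        # Flip the y axis, because image coordinates start at the top-left corner
        top, bottom = page_h - top, page_h - bottom

    return (left * sx, top * sy, right * sx, bottom * sy)


# Example values (made up): a 612 x 792 page rendered at twice that size
pixel_box = bbox_to_pixels((72, 720, 300, 700), (612, 792), (1224, 1584))
```

The resulting tuple is in the `(x0, y0, x1, y1)` order that common drawing libraries expect for rectangles.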
Answered by
AIMPED
Jan 7, 2025
Well, I found a way to access the images, but is there a better way to do so?

from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

# Ask the PDF pipeline to keep a rendered image of every page
pipeline_options = PdfPipelineOptions()
pipeline_options.generate_page_images = True
doc_converter = DocumentConverter(
    format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)}
)
conv_res = doc_converter.convert("https://pdfobject.com/pdf/sample.pdf")

# Collect each page image as a URI string
image_strings = []
for page in conv_res.document.pages.values():
    image_strings.append(page.image.uri.unicode_string())
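To actually draw on the images, the collected strings need to be turned back into image data. Below is a minimal sketch that assumes each entry of `image_strings` is a base64 data URI of the form `data:image/png;base64,...`; that format is an assumption about what the converter emits, and `data_uri_to_bytes` is a hypothetical helper name. It uses only the standard library.

```python
import base64


def data_uri_to_bytes(uri: str) -> bytes:
    """Decode a base64 data URI (data:<mime>;base64,<payload>) into raw bytes."""
    header, sep, payload = uri.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URI")
    return base64.b64decode(payload)


# Stand-in for one entry of image_strings: only the 8-byte PNG signature,
# encoded the same way (assumed) as a real page image would be
sample_uri = "data:image/png;base64," + base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
png_bytes = data_uri_to_bytes(sample_uri)
```

With a real page image, the decoded bytes could be written to a `.png` file or opened with an imaging library such as Pillow via `Image.open(io.BytesIO(png_bytes))` before drawing the boxes.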
Answer selected by
AIMPED