Is there a way to access the Image Input of the OCR? #696

Answered by AIMPED
AIMPED asked this question in Q&A
Well, I found a way to access the images, but is there a better way to do so?

from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

pipeline_options = PdfPipelineOptions()
pipeline_options.generate_page_images = True

doc_converter = DocumentConverter(format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)})

conv_res = doc_converter.convert("https://pdfobject.com/pdf/sample.pdf")
# Collect the URI string of every generated page image, keyed by page order.
image_strings = []
for page_no, page in conv_res.document.pages.items():
    image_strings.append(page.image.uri.unicode_string())
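If the collected strings are base64 data URIs (an assumption on my part; I have not checked this against every docling version), they can be turned back into raw image bytes with the standard library alone. The sketch below uses a hypothetical helper, `decode_data_uri`, and a locally built `sample_uri` as a stand-in for one entry of `image_strings`, so it runs without docling or network access.

```python
import base64


def decode_data_uri(uri: str) -> tuple[str, bytes]:
    """Split a base64 data URI into (mime_type, raw_bytes).

    Hypothetical helper for illustration; not part of docling.
    """
    header, sep, payload = uri.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


# Stand-in for one entry of image_strings (assumed shape: data:<mime>;base64,<payload>).
png_signature = b"\x89PNG\r\n\x1a\n"
sample_uri = "data:image/png;base64," + base64.b64encode(png_signature).decode("ascii")

mime_type, raw = decode_data_uri(sample_uri)
```

The resulting bytes can be written to disk or opened with Pillow via `Image.open(io.BytesIO(raw))`. The image object may also expose a `pil_image` property that skips the string round trip entirely, but treat that as an assumption to verify against your installed version.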

Answer selected by AIMPED