
Any way to save the output markdown into a folder? #1

Open · drmetro09 opened this issue Dec 21, 2024 · 9 comments

@drmetro09 commented Dec 21, 2024

Any way to save the output into a folder?

drmetro09 changed the title from "Docker compose gpu not working" to "Any way to save the output markdown into a folder?" on Dec 21, 2024

@xiaoyao9184 (Owner)

You should use the marker project directly, or run this project as a Gradio service and call it with gradio_client.

Every Gradio app has an API reference linked at the bottom of its page.
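
For example, a minimal gradio_client sketch along these lines (the server URL, the api_name, and the shape of the returned value are assumptions; check the API page of the running app for the real names):

from pathlib import Path
from gradio_client import Client, handle_file

# Connect to the running Gradio service (URL is an assumption for a local deployment).
client = Client("http://localhost:7860/")

# Call the conversion endpoint with a local PDF. The api_name "/convert" is a
# placeholder; the real endpoint name is listed on the app's API page.
result = client.predict(handle_file("paper.pdf"), api_name="/convert")

# Assuming the first returned value is the Markdown text, write it to a folder.
out_dir = Path("md_out")
out_dir.mkdir(exist_ok=True)
markdown = result[0] if isinstance(result, (list, tuple)) else result
(out_dir / "paper.md").write_text(markdown, encoding="utf-8")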

@drmetro09 (Author)

Is it possible to mount the output folder?

@drmetro09 (Author)

I just want it to save the Markdown file to a folder. How can I do it?

@xiaoyao9184 (Owner)

You need to call convert_single.py or convert.py.
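
For a one-off run, a hedged sketch of invoking convert_single.py inside an already-running container from the host (the container name, script location, and argument order are assumptions and may differ between Marker versions):

import subprocess

# "marker_pdf_convert" is a hypothetical container name; the arguments
# (input PDF, then output folder) are an assumption, check Marker's docs.
subprocess.run(
    ["docker", "exec", "marker_pdf_convert",
     "python", "convert_single.py", "/pdf_in/paper.pdf", "/md_out"],
    check=True,
)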

@drmetro09 (Author)

How do I do that from the Gradio UI? Can you please explain?

@xiaoyao9184 (Owner)

You can run any of the commands provided by Marker. However, unless it is a one-time task, running them as command-line scripts is not recommended, because the models are reloaded on every run. Below is an example that uses offline models to batch-process PDF inputs and output them as Markdown files.

services:
  marker_pdf_convert:
    image: xiaoyao9184/marker:1.0.0
    container_name: marker_pdf_convert
    command: marker /pdf_in --output_dir /md_out
    environment:
      - TORCH_DEVICE=cuda
      - HF_HUB_OFFLINE=true
      - DETECTOR_MODEL_CHECKPOINT=/root/.cache/huggingface/hub/models--vikp--surya_det3/snapshots/467ee9ec33e6e6c5f73e57dbc1415b14032f5b95
      - LAYOUT_MODEL_CHECKPOINT=/root/.cache/huggingface/hub/models--datalab-to--surya_layout0/snapshots/421ac206a400227ea714d47a405e53ce74374957
      - RECOGNITION_MODEL_CHECKPOINT=/root/.cache/huggingface/hub/models--vikp--surya_rec2/snapshots/6611509b2c3a32c141703ce19adc899d9d0abf41
      - TABLE_REC_MODEL_CHECKPOINT=/root/.cache/huggingface/hub/models--vikp--surya_tablerec/snapshots/8bca165f81e9cee5fb382413eb23175079917d14
      - TEXIFY_MODEL_NAME=/root/.cache/huggingface/hub/models--vikp--texify/snapshots/ce49c1fe10842e78b8be61f9e762b85ac952807d
    volumes:
      - ./../../cache:/root/.cache
      - ./../../pdf_in:/pdf_in
      - ./../../md_out:/md_out
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [ gpu ]

@drmetro09 (Author)

The generated MD output is not accurate; it is full of spelling errors.

@xiaoyao9184 (Owner)

You can try updating to the latest version. If the problem is in the marker project itself, I can't help you.

@drmetro09 (Author) commented Dec 23, 2024

I'm using your latest Docker image. How do I update the environment variables to the latest ones?
