-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GDCM dependency to Dockerfile #5
Comments
Agree on both fronts! I haven't looked at this repository in a bit, let me see what I can do. |
It would be great to release a new build to DockerHub! |
I'm eager to compare your progress with a machine learning approach to my approach with tesseract OCR, computer vision, and python. |
Definitely! I'd like to do this proper with a Docker container deployed via CI, so I'll likely take a day or two to do this. I'll post an update (and ping you on the PR) when I have something to test. |
I will definitely test it!
Daniel Snider ツ
…On Wed, Jan 2, 2019 at 6:16 PM Vanessa Sochat ***@***.***> wrote:
Definitely! I'd like to do this proper with a Docker container deployed
via CI, so I'll likely take a day or two to do this. I'll post an update
(and ping you on the PR) when I have something to test.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#5 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABqDWAvix3xCjS1PZndgu2TqrIsczx3Sks5u_T3kgaJpZM4ZnWgY>
.
|
Just a heads up, I'm renaming the repo to "dicom-cleaner" because the scraper doesn't as well describe the intended purpose. The Docker images will be built from:
I don't have Github Actions for this organization yet, but if/when we do the deploy will be much easier than relying on an external service. |
And it would be really fun to test these images and develop new methods! I created the tooling a while back and nobody was super interested in a proper testing, so I'm really happy you are ! :D |
hey @danielsnider I'm still figuring out the CI (customizing the orb to build only on PRs and then push on merge to master) but since the container (with gdcm installed via conda) was erroneously pushed to docker hub, if you want to play around with it, here you go! https://cloud.docker.com/u/pydicom/repository/docker/pydicom/dicom-ocr-cleaner
I should have some time to check up on this later, hopefully we will hear back from Circle and then can test the GDCM as well. |
hey @danielsnider the CI is added, see the images being built and pushed to the docker repos I mentioned above here --> https://circleci.com/gh/pydicom/dicom-cleaner I added the gdcm install to the pydicom/dicom-ocr-cleaner, although I didn't do any testing, etc (the PR was to add the continuous integration). If you'd like to use this container as a start for your testing, with your feedback we can open another PR to work on the software itself. |
Here we go! I'm trying your new container! Can you look into the error below? I've included the python package versions that are in the container too. It looks like pydicom version
Hopefully just a wrong dependency issue on |
Yep this is actually what I expected! The module needs to be updated for the newest release of pydicom. I'll do this in a new PR and we can continue discussion from there. |
Greatness!
… On Jan 3, 2019, at 7:54 PM, Vanessa Sochat ***@***.***> wrote:
Yep this is actually what I expected! The module needs to be updated for the newest release of pydicom. I'll do this in a new PR and we can continue discussion from there.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
hey @danielsnider a PR is open with fixes to the header cleaner, see here, and see the last comment for a pushed container you can test. I'm not sure how well you can test with your images since the header flagging largely depends on the deid recipe (the filter) criteria used. If your images aren't flagged I'd like to ask you to look at deid.dicom's default recipe -> and comment on what combinations of header fields are being missed! https://github.com/pydicom/deid/blob/master/deid/data/deid.dicom It's a terrible strategy, generally, but the problems with ML approaches is that they take much longer to do. The ocr image has a bug #9 that likely needs fixing via updating versions. If you have any insight let me know, the original issue is linked and I just don't have time to work on it now. |
Nice work!! I've made a suggestion for
#9 I'm really interested in
the ML approach.
Because I have too much variety in my dicom image set, I'm trying avoid
using known PHI coordinates as in the deid.dicom's default recipe. However,
I may come back to it.
…On Fri, Jan 4, 2019 at 3:57 PM Vanessa Sochat ***@***.***> wrote:
hey @danielsnider <https://github.com/danielsnider> a PR is open with
fixes to the header cleaner, see here, and see the last comment for a
pushed container you can test.
#8 <#8>
I'm not sure how well you can test with your images since the header
flagging largely depends on the deid recipe (the filter) criteria used. If
your images *aren't* flagged I'd like to ask you to look at deid.dicom's
default recipe -> and comment on what combinations of header fields are
being missed!
https://github.com/pydicom/deid/blob/master/deid/data/deid.dicom It's a
terrible strategy, generally, but the problems with ML approaches is that
they take much longer to do.
The ocr image has a bug #9
<#9> that likely needs
fixing via updating versions. If you have any insight let me know, the
original issue is linked and I just don't have time to work on it now.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#5 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABqDWCIUNH4tNTnxODaZEgiUWs12XnORks5u_8AlgaJpZM4ZnWgY>
.
|
Actually, I will try the header cleaner! If it can clean a percentage of my
images, that's a win! I'll do a quick "does it run" test right now.
Daniel Snider ツ
On Fri, Jan 4, 2019 at 4:17 PM Daniel Snider <danielsnider12@gmail.com>
wrote:
… Nice work!! I've made a suggestion for
#9 I'm really interested
in the ML approach.
Because I have too much variety in my dicom image set, I'm trying avoid
using known PHI coordinates as in the deid.dicom's default recipe. However,
I may come back to it.
On Fri, Jan 4, 2019 at 3:57 PM Vanessa Sochat ***@***.***>
wrote:
> hey @danielsnider <https://github.com/danielsnider> a PR is open with
> fixes to the header cleaner, see here, and see the last comment for a
> pushed container you can test.
>
> #8 <#8>
>
> I'm not sure how well you can test with your images since the header
> flagging largely depends on the deid recipe (the filter) criteria used. If
> your images *aren't* flagged I'd like to ask you to look at deid.dicom's
> default recipe -> and comment on what combinations of header fields are
> being missed!
> https://github.com/pydicom/deid/blob/master/deid/data/deid.dicom It's a
> terrible strategy, generally, but the problems with ML approaches is that
> they take much longer to do.
>
> The ocr image has a bug #9
> <#9> that likely needs
> fixing via updating versions. If you have any insight let me know, the
> original issue is linked and I just don't have time to work on it now.
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <#5 (comment)>,
> or mute the thread
> <https://github.com/notifications/unsubscribe-auth/ABqDWCIUNH4tNTnxODaZEgiUWs12XnORks5u_8AlgaJpZM4ZnWgY>
> .
>
|
I'm on the same page - the "hardcoded" memorize patterns in headers to find pixels just doesn't have feet! The image is rebuilding now - I'll let you know how it goes after testing! |
My DICOM images seem to be compressed and ask for
GDCM
. It would be great to addGDCM
to the ocr Dockerfile :-).This is what I see:
Sidenote: Commenting on line main.py#L157... if you catch all exceptions it would be great to at minimum print the exception instead of hiding it. Not seeing the error is the worst kind of error.
The text was updated successfully, but these errors were encountered: