Better Figure Segmentation through Edge Detection? #4

Open
DonaldTsang opened this issue Jan 16, 2020 · 9 comments

Comments

@DonaldTsang

This may sound weird, but is it possible to use edges of regions to redefine figure segmentation to make it more accurate?

@jerryli27
Owner

Good point : ) This is actually how the dataset was created. I first used edge detection to separate each image into regions, then manually corrected the edges, because edge detection can be very inaccurate when the background color is similar to the character's color.

@DonaldTsang
Author

This might sound weird, but that is exactly the question I raised in another project: KichangKim/DeepDanbooru#5
How do you "manually correct" the data? And how does Google reCAPTCHA do it (if we need to resort to crowdsourcing)?

@jerryli27
Owner

First I generated the edges from the image. I wrote an Angular-based HTML UI to select regions bounded by those edges (using the flood fill algorithm). If I see any overflow, I modify the edge layer until the fill no longer overflows. Then I save the masked regions as a separate image, which serves as the segmentation ground truth. Does that make sense?
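
A minimal sketch of the select-and-correct loop described above, not the actual Angular tool. It assumes the edge layer is already available as a binary grid (the edge detector itself is not specified in this thread), and `flood_select` and `parse` are hypothetical helper names. It shows a fill leaking through a gap in the edge layer, and the leak disappearing once the gap is closed by hand.

```python
from collections import deque


def flood_select(edges, seed):
    """Return a boolean mask of the non-edge pixels 4-connected to `seed`.

    `edges` is a grid of booleans (True = edge pixel); `seed` is (row, col).
    """
    h, w = len(edges), len(edges[0])
    mask = [[False] * w for _ in range(h)]
    sy, sx = seed
    if edges[sy][sx]:
        return mask  # clicking on an edge selects nothing
    mask[sy][sx] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            inside = 0 <= ny < h and 0 <= nx < w
            if inside and not edges[ny][nx] and not mask[ny][nx]:
                mask[ny][nx] = True
                queue.append((ny, nx))
    return mask


def parse(rows):
    """Turn an ASCII drawing into an edge grid ('#' = edge pixel)."""
    return [[ch == "#" for ch in row] for row in rows]


# A box outline whose right wall is missing at row 2, column 4.
edge_layer = parse([
    "......",
    ".####.",
    ".#....",
    ".####.",
    "......",
])

# The fill escapes through the gap and covers every non-edge pixel.
leaky_area = sum(map(sum, flood_select(edge_layer, (2, 2))))

# Manual correction: close the gap in the edge layer, then select again.
edge_layer[2][4] = True
region_mask = flood_select(edge_layer, (2, 2))
fixed_area = sum(map(sum, region_mask))

print(leaky_area, fixed_area)  # 21 2
```

After the correction, `region_mask` is the kind of per-region mask that could be saved as a separate image and used as ground truth.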

@DonaldTsang
Author

DonaldTsang commented Jan 19, 2020

I would not say that I follow completely... Is flood fill similar to MSPaint's bucket tool, except that it selects a region instead of overwriting it? In other words, is it like a heavily simplified version of Photoshop's Magic Selection tool?
If that is exactly what you are doing, how can I replicate such a system at scale for a larger dataset?

@jerryli27
Owner

Exactly like the Magic Selection tool. Efficient library implementations can be found online pretty easily.
For a larger dataset, in my experience you will not get high-quality data from an untrained crowd. Segmentation is significantly more difficult than a captcha. I'd suggest that you (1) get funding for it or work with a company, and (2) double- and triple-check whether you REALLY need such a big dataset, on the order of 10k or 100k labels. Are you doing this just for fun? Can you get away with data augmentation, which is much simpler?
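
A minimal sketch of what data augmentation could look like for segmentation data, assuming images and masks are plain nested lists of equal size. `hflip`, `rot90`, and `augment_pair` are hypothetical names, not part of this project. The one point it illustrates is that every geometric transform must be applied identically to the image and its mask so the labels stay aligned.

```python
def hflip(grid):
    """Mirror a grid left to right."""
    return [row[::-1] for row in grid]


def rot90(grid):
    """Rotate a grid 90 degrees clockwise."""
    return [list(col) for col in zip(*grid[::-1])]


def augment_pair(image, mask):
    """Yield 8 (image, mask) variants: 4 rotations, each with and without a flip."""
    img, msk = image, mask
    for _ in range(4):
        yield img, msk
        yield hflip(img), hflip(msk)
        img, msk = rot90(img), rot90(msk)


# Toy example: the mask marks the single pixel whose value is 1.
image = [[1, 2],
         [3, 4]]
mask = [[True, False],
        [False, False]]

variants = list(augment_pair(image, mask))
print(len(variants))  # 8
```

Flips and right-angle rotations are lossless on a pixel grid; transforms such as arbitrary-angle rotation or scaling would also need an interpolation choice for the mask.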

@DonaldTsang
Author

DonaldTsang commented Jan 19, 2020

@jerryli27 Unfortunately, pure image tagging without regions is something DeepDanbooru already does, with results ranging from "questionable" to "great". We want to do more than that, and image segmentation might give better insight into how we can improve image tagging. The dataset we currently use is based on https://www.gwern.net/Danbooru2019, which has no segmentation; based on the DeepDanbooru results, maybe we can leverage it to ease into generating segmented data.

It is kind of for fun, but I would really hope that this could be part of my future mental exercise.

@jerryli27
Owner

Makes sense : ) What application are you targeting with segmentation that you cannot do with tagging, if I may ask?

@DonaldTsang
Author

DonaldTsang commented Jan 19, 2020

It is more about using segmentation as a means to improve automated or machine-aided tagging through deep learning,
and about discovering patterns within the tagging knowledge graph itself through segmentation structures and overlaps.

@DonaldTsang
Author

Technology is moving fast: see the comment at KichangKim/DeepDanbooru#5

2 participants