This repository has been archived by the owner on Dec 18, 2024. It is now read-only.

Title: Question about DPT model performance and network size adjustments #95

Open
JunhyeongDoyle opened this issue Sep 29, 2024 · 3 comments

Comments

@JunhyeongDoyle

Hi, first of all, thank you for sharing the code and resources with the community! I’ve been experimenting with the four pretrained models provided in the repository to extract depth maps. While testing, I adjusted the network size parameters (net_h, net_w) and observed that increasing these values seemed to improve the detail in the depth estimation, especially in more complex regions of the images.

However, I have a concern that increasing these values too much might lead to a trade-off where the model focuses too heavily on local features at the cost of global geometric consistency across the image. I would like to know your thoughts on this hypothesis: Could increasing the network size cause a decrease in global geometric coherence?

Additionally, for processing images with a resolution of 1920x1080, I aim to achieve a dense depth map without geometric inconsistencies. Could you recommend which of the four pretrained weights would be best suited for this task? And, based on your experience, what would be an optimal setting for net_h and net_w to balance detail and global consistency?

Thanks again for your help and for providing this fantastic tool!

@kristoftunner

@JunhyeongDoyle Did you get an answer to the resolution part of your question? How do you produce a depth map from a 16:9 input without degrading the image quality?
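One way to approach the 16:9 question is to choose net_w and net_h so that they keep the input's aspect ratio instead of forcing a square network size. The sketch below is only an illustration: the helper name `aspect_preserving_net_size`, the default shorter side of 384, and the requirement that both sides be multiples of 32 are assumptions made for this example, not values confirmed anywhere in this thread.

```python
def aspect_preserving_net_size(img_w, img_h, short_side=384, multiple=32):
    """Return (net_w, net_h) that roughly keep the input aspect ratio.

    Hypothetical helper: `short_side` and `multiple` are illustrative
    assumptions, not settings taken from the repository.
    """
    if min(img_w, img_h, short_side, multiple) <= 0:
        raise ValueError("all arguments must be positive")

    # Scale so that the shorter image side lands on `short_side`.
    scale = short_side / min(img_w, img_h)

    def snap(value):
        # Round to the nearest multiple, but never below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(img_w * scale), snap(img_h * scale)


# A 1920x1080 frame with the assumed default shorter side of 384:
print(aspect_preserving_net_size(1920, 1080))  # (672, 384)
```

Raising `short_side` gives a larger network size with the same proportions, which makes it easier to compare settings without also changing the aspect ratio.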

@JunhyeongDoyle
Author

@kristoftunner Hi, thanks for reaching out. In short, I haven't found an optimal method yet. When I kept the network size unchanged and fed in higher-resolution 16:9 images, the network struggled to extract depth accurately, especially in areas with high-frequency detail. Conversely, when I increased the network size to match the higher resolution, the output looked sharper in detailed areas, but my impression was that the depth values themselves became less accurate.
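The impression that a larger network size sharpens detail but hurts the overall depth structure could be checked numerically rather than by eye. The following is a minimal sketch, assuming the predictions are relative depth values that are only meaningful up to a scale and a shift, and assuming both predictions have already been resampled to the same grid and flattened. The function name `scale_shift_rms` is hypothetical, and the result measures agreement between two predictions, not accuracy against ground truth.

```python
import math


def scale_shift_rms(pred, ref):
    """Fit s, t minimizing sum((s * pred + t - ref)^2); return (s, t, rms).

    Hypothetical helper: `pred` and `ref` are equal-length flat sequences,
    e.g. a large-net_h/net_w prediction and a default-size prediction
    sampled at the same pixel locations.
    """
    pred = [float(v) for v in pred]
    ref = [float(v) for v in ref]
    n = len(pred)
    if n != len(ref) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")

    mean_p = sum(pred) / n
    mean_r = sum(ref) / n
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0.0:
        raise ValueError("pred is constant; scale is undefined")

    # Closed-form least-squares solution for a 1-D affine fit.
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    s = cov / var_p
    t = mean_r - s * mean_p

    # Residual left after the best affine alignment.
    rms = math.sqrt(sum((s * p + t - r) ** 2 for p, r in zip(pred, ref)) / n)
    return s, t, rms
```

A residual that grows as net_h and net_w increase would suggest the large-size prediction is drifting away from the coarse structure of the default-size one, which is the effect described above. A small residual would suggest the extra detail comes without much change in global layout.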

@kristoftunner

Thanks for the answer!
