Possible ambiguous hidden labels in Sex vocabulary #165

Open
marcos-lg opened this issue Jan 2, 2025 · 4 comments
@marcos-lg
Contributor

There are some warnings in the vocabulary lookup regarding hidden labels that are mapped to more than 1 concept after normalizing the label.

For example, there is an M J hidden label in Male:

https://api.gbif.org/v1/vocabularies/Sex/concepts?hiddenLabel=M%20J

Then there is a 2M2J in Mixed:

https://api.gbif.org/v1/vocabularies/Sex/concepts?hiddenLabel=2M2J

And 1M 3J, 1M 1J, 1M 2J and others also mapped to Male, e.g.: https://api.gbif.org/v1/vocabularies/Sex/concepts?hiddenLabel=1M%203J

Doesn't it seem a bit contradictory?

I also found these other special cases:

marcos-lg changed the title from "Possible ambiguous hidden label in Sex vocabulary" to "Possible ambiguous hidden labels in Sex vocabulary" Jan 2, 2025
@CecSve
Collaborator

CecSve commented Jan 6, 2025

Thanks for catching these! They should be corrected, and I will update the mappings.

@CecSve
Collaborator

CecSve commented Jan 6, 2025

> M F in Mixed and 8M 0F in Male. This might make sense because the latter says 0 females, but when we do the lookup of this vocab we remove the numbers, so these 2 values become equal. Maybe we should remove only the digits 1 to 9 and keep the zero? (There are more cases with zero.)

Good point - let's not remove zeros in the lookup.
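
To make the effect of keeping zeros concrete, here is a minimal Python sketch. It is not the actual GBIF lookup code: `normalize_label` and its `keep_zero_counts` flag are hypothetical names, and the normalization steps (drop counts, drop whitespace, lowercase) are an assumption based on the description in this thread.

```python
import re


def normalize_label(label: str, keep_zero_counts: bool = True) -> str:
    """Hypothetical normalizer: strip counts, whitespace and case from a label.

    Assumption: a count is a whole run of digits. When keep_zero_counts is
    True, a count equal to zero is kept as "0" so that "0F" stays distinct
    from "F"; every other count is removed.
    """

    def replace_count(match: re.Match) -> str:
        if keep_zero_counts and int(match.group()) == 0:
            return "0"
        return ""

    without_counts = re.sub(r"\d+", replace_count, label)
    return re.sub(r"\s+", "", without_counts).lower()


# Stripping every number makes the two labels collide on the same key.
print(normalize_label("M F", keep_zero_counts=False))    # mf
print(normalize_label("8M 0F", keep_zero_counts=False))  # mf

# Keeping zero counts separates them again.
print(normalize_label("M F"))    # mf
print(normalize_label("8M 0F"))  # m0f
```

One design note: the sketch treats each run of digits as a single count rather than removing the characters 1 to 9 one by one. A character-by-character rule would turn a label such as 10M into 0M, which would then read as "zero males".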

@ManonGros
Collaborator

Wouldn't things like M J be for young male? 2M2J should probably be male.

@CecSve
Collaborator

CecSve commented Jan 7, 2025

> Wouldn't things like M J be for young male? 2M2J should probably be male.

M J is what remains once the numbers have been stripped from the original value, so it used to correspond to, for example, 1M6J, and we should treat those as M. The 2M2J in Mixed is a mistake.
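
For reference, a check like the one that produced the original warnings could be sketched as below. This is only an illustration under stated assumptions: `normalize_label` and `find_ambiguous` are hypothetical helpers, the normalization (drop digits and whitespace, lowercase) is guessed from the discussion, and the sample data contains only the labels quoted in this issue rather than the full vocabulary.

```python
import re
from collections import defaultdict


def normalize_label(label: str) -> str:
    """Assumed normalization: remove digits and whitespace, then lowercase."""
    return re.sub(r"[\d\s]+", "", label).lower()


def find_ambiguous(hidden_labels: dict) -> dict:
    """Return normalized keys that are claimed by more than one concept."""
    concepts_by_key = defaultdict(set)
    for concept, labels in hidden_labels.items():
        for label in labels:
            concepts_by_key[normalize_label(label)].add(concept)
    return {
        key: sorted(concepts)
        for key, concepts in concepts_by_key.items()
        if len(concepts) > 1
    }


# Only the hidden labels mentioned in this issue, not the real vocabulary.
sample = {
    "Male": ["M J", "1M 3J", "1M 1J", "1M 2J"],
    "Mixed": ["2M2J"],
}

print(find_ambiguous(sample))  # {'mj': ['Male', 'Mixed']}
```

With 2M2J moved from Mixed to Male, the same check returns an empty result for this sample.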
