getPrevalent / getPrevalence #659

Open
antagomir opened this issue Nov 20, 2024 · 1 comment

@antagomir
Member

In some applications we like to calculate prevalences per group, and possibly then pick features that are sufficiently prevalent in at least one group. Currently this can be done e.g. with:

library(mia)
data(peerj13075)
tse <- peerj13075

# Split the tse data object by geographical location
tses <- splitOn(tse, group = "Geographical_location")

# Calculate prevalences per group (features x groups table)
prev <- sapply(tses, function (tse) getPrevalence(tse, assay.type="counts"))
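The second step mentioned above, picking features that are sufficiently prevalent in at least one group, could then be sketched roughly as follows. This is only an illustration: the 0.1 cutoff and the names min_prevalence, keep and tse_sub are assumptions made for the example, not anything defined by mia.

```r
library(mia)
data(peerj13075)
tse <- peerj13075

# Features x groups prevalence table, built as in the snippet above
tses <- splitOn(tse, group = "Geographical_location")
prev <- sapply(tses, function(x) getPrevalence(x, assay.type = "counts"))

# Hypothetical cutoff, chosen for illustration only
min_prevalence <- 0.1

# Keep features whose prevalence reaches the cutoff in at least one group
keep <- rownames(prev)[rowSums(prev >= min_prevalence) > 0]

# Subset the original object to the selected features
tse_sub <- tse[keep, ]
```

Using rowSums() on the logical matrix counts the groups in which each feature passes the cutoff, so "> 0" expresses "at least one group".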

mia could possibly simplify this by providing a wrapper, for example:

prev <- getPrevalenceByGroup(tse, assay.type="counts", group="Geographical_location")

But perhaps it is simple enough already without a wrapper? At the very least, we could add an example of this to the manpage of getPrevalence.

@TuomasBorman
Contributor

I tend to think that this is easy enough currently. Moreover, as far as I know, this is not that commonly done.

However, as you said, we should have examples of this. Instead of the manpage, OMA could also be suitable, as this affects more than just this function. Here we describe splitOn() (I also noticed a typo): https://microbiome.github.io/OMA/docs/devel/pages/wrangling.html#sec-splitting The text says that we can use it for this purpose, but it would benefit from a clear example of how to do that.

Adding getPrevalenceByGroup() might be something to discuss in the future. However, adding new functions that do similar things is not optimal. It might be confusing.

If we decide to implement this, this issue might be relevant: microbiome/miaTime#30. We discussed a general split-apply-combine function there. However, I still think that improving the examples is the easier and more sustainable way.
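For reference, the split-apply-combine idea could be sketched roughly as below. applyByGroup() is a hypothetical name used only for this illustration; it is not a function in mia or miaTime, and the sketch simply wraps the existing splitOn() and sapply() calls from the example in the first comment.

```r
library(mia)

# Hypothetical helper, not part of mia: split the object by a grouping
# variable, apply FUN to each subset, and combine the results with sapply()
applyByGroup <- function(x, group, FUN, ...) {
    subsets <- splitOn(x, group = group)
    sapply(subsets, FUN, ...)
}

data(peerj13075)

# Per-group prevalence as one application of the helper
prev <- applyByGroup(peerj13075, "Geographical_location",
                     getPrevalence, assay.type = "counts")
```

With a helper like this, per-group prevalence would be one use case among many rather than a dedicated function, which is the trade-off discussed above.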
