Coresets API proposal #89
class BaseCoreset:
    def get_ranks(self, features: Array, logits: Array) -> Indices:
        # K-Means or something
        pass
Should we have an option for selecting the top n or something?
In heuristics we do not have that option.
I will add it.
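To make the "top n" option concrete, here is a minimal sketch of how a ranking function could accept it. It swaps in k-center greedy (farthest-first traversal) rather than the K-Means mentioned in the RFC, uses plain Python lists instead of the project's `Array` type, and the names `kcenter_greedy_ranks` and `top_n` are hypothetical, not part of any existing API.

```python
import math
from typing import List, Optional, Sequence


def kcenter_greedy_ranks(
    features: Sequence[Sequence[float]],
    top_n: Optional[int] = None,
) -> List[int]:
    """Rank pool indices by farthest-first traversal (hypothetical helper).

    If `top_n` is given, only the first `top_n` ranked indices are returned.
    """
    n = len(features)
    limit = n if top_n is None else min(top_n, n)
    if limit <= 0:
        return []

    # Assumption: start from index 0; any deterministic seed would do.
    selected = [0]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(features[i], features[0]) for i in range(n)]
    min_dist[0] = -1.0  # mark as already selected

    while len(selected) < limit:
        # Pick the unselected point that is farthest from the current selection.
        nxt = max(range(n), key=min_dist.__getitem__)
        selected.append(nxt)
        min_dist[nxt] = -1.0
        for i in range(n):
            if min_dist[i] >= 0:
                min_dist[i] = min(min_dist[i], math.dist(features[i], features[nxt]))
    return selected


pool = [[0.0, 0.0], [0.1, 0.0], [10.0, 0.0], [5.0, 0.0]]
full_ranking = kcenter_greedy_ranks(pool)
top_two = kcenter_greedy_ranks(pool, top_n=2)
```

Because the ranking is built greedily, truncating to `top_n` gives the same prefix as the full ranking, so the option costs nothing extra in the API.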
rfcs/02-coresets.md
def filter(logits, heuristic, proportion=1.) -> Indices:
    # Use `heuristic` to rank features and return the top `proportion` indices.
    pass
Should we have a way to give more importance to coreset results or to heuristic results?
Not sure how to merge them.
The idea of this method is to use BALD to create a pool of potential candidates (i.e. the top 10%) and select from them with a coreset. I will remove it, as we don't know if it works.
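The two-stage idea described here could be sketched roughly as follows. This is only an illustration under stated assumptions: the uncertainty scores (e.g. BALD) are taken as precomputed floats, the coreset step is a simple farthest-first traversal, and `farthest_first` and `prefilter_then_coreset` are hypothetical names rather than functions from the RFC.

```python
import math
from typing import List, Sequence


def farthest_first(points: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return up to k positions into `points`, chosen by farthest-first traversal."""
    n = len(points)
    k = min(k, n)
    if k <= 0:
        return []
    chosen = [0]  # assumption: seed with the first point
    dist = [math.dist(p, points[0]) for p in points]
    dist[0] = -1.0  # mark as chosen
    while len(chosen) < k:
        nxt = max(range(n), key=dist.__getitem__)
        chosen.append(nxt)
        dist[nxt] = -1.0
        for i in range(n):
            if dist[i] >= 0:
                dist[i] = min(dist[i], math.dist(points[i], points[nxt]))
    return chosen


def prefilter_then_coreset(
    features: Sequence[Sequence[float]],
    scores: Sequence[float],
    proportion: float = 0.1,
    k: int = 1,
) -> List[int]:
    """Keep the top `proportion` by uncertainty score, then pick k diverse points."""
    n = len(scores)
    if n == 0:
        return []
    pool_size = max(1, math.ceil(proportion * n))
    # Most uncertain first; sorted() is stable, so ties keep their index order.
    candidates = sorted(range(n), key=lambda i: scores[i], reverse=True)[:pool_size]
    local = farthest_first([features[i] for i in candidates], k)
    # Map positions within the candidate pool back to original indices.
    return [candidates[j] for j in local]


features = [[0, 0], [0, 1], [10, 10], [10, 11], [5, 5], [20, 20]]
scores = [0.9, 0.8, 0.7, 0.1, 0.2, 0.05]
picked = prefilter_then_coreset(features, scores, proportion=0.5, k=2)
```

Since the candidates are sorted by score, the traversal is seeded with the most uncertain point, which is one simple way of letting the heuristic influence the diversity step.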
I think it is a nice proposal. We'll see in action what other changes are needed. I'll merge.
Did this idea get anywhere in the end, e.g. a weighted balance between coreset selection and some other heuristic such as BALD?
I'd be very tempted to work on it, with the view that we'd be creating a class to combine any two (or more?) strategies with some weighting. Independently of that, I think being able to initialize with coreset (and other diversity metrics) through a very simple API would be nice for beginners. E.g. for the minute I have
it would be nice to have
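Separately, the idea of combining two or more strategies with some weighting might look roughly like the sketch below. It uses a weighted Borda-style count over the rankings each strategy produces, which is one possible merging rule chosen for illustration and not something the thread settled on; `combine_rankings` is a hypothetical name.

```python
from typing import Dict, List, Sequence


def combine_rankings(
    rankings: Sequence[Sequence[int]],
    weights: Sequence[float],
) -> List[int]:
    """Merge several rankings of pool indices with a weighted Borda-style count."""
    if len(rankings) != len(weights):
        raise ValueError("need exactly one weight per ranking")
    totals: Dict[int, float] = {}
    for ranking, weight in zip(rankings, weights):
        n = len(ranking)
        for position, index in enumerate(ranking):
            # The best-ranked index earns n points, the last one earns 1.
            totals[index] = totals.get(index, 0.0) + weight * (n - position)
    # Highest total first; ties are broken by the smaller index.
    return sorted(totals, key=lambda i: (-totals[i], i))


bald_ranking = [2, 0, 1, 3]      # e.g. from an uncertainty heuristic
coreset_ranking = [3, 1, 0, 2]   # e.g. from a diversity strategy
mostly_bald = combine_rankings([bald_ranking, coreset_ranking], [0.7, 0.3])
only_coreset = combine_rankings([bald_ranking, coreset_ranking], [0.0, 1.0])
```

Working on ranks rather than raw scores sidesteps the problem of uncertainty scores and distances living on different scales, at the cost of discarding how large the gaps between items are.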
Summary:
New coreset API that I would like to implement.
Features:
Related to #67